Highly Optimized Tolerance and Power Laws in Dense and Sparse Resource Regimes
Power law cumulative frequency ($P$) vs. event size ($l$) distributions
$P(\geq l) \sim l^{-\alpha}$ are frequently cited as evidence for complexity
and serve as a starting point for linking theoretical models and mechanisms
with observed data. Systems exhibiting this behavior present fundamental
mathematical challenges in probability and statistics. The broad span of
length and time scales associated with heavy tailed processes often requires
special sensitivity to distinctions between discrete and continuous
phenomena. A discrete Highly Optimized Tolerance (HOT) model, referred to as
the Probability, Loss, Resource (PLR) model, gives the exponent
$\alpha = 1/d$ as a function of the dimension $d$ of the underlying substrate
in the sparse resource regime. This agrees well with data for wildfires, web
file sizes, and electric power outages. However, another HOT model, based on
a continuous (dense) distribution of resources, predicts $\alpha = 1 + 1/d$.
In this paper we describe and analyze a third model, the cuts model, which
exhibits both behaviors but in different regimes. We use the cuts model to
show all three models agree in the dense resource limit. In the sparse
resource regime, the continuum model breaks down, but in this case, the cuts
and PLR models are described by the same exponent.



I. INTRODUCTION 

In this paper we analyze a family of abstract, mathe- 
matical models which have been used to illustrate Highly 
Optimized Tolerance (HOT), a mechanism for complexity based on robustness
tradeoffs in systems subject to uncertain environments. HOT systems
abound in nature and modern technology, and are com- 
plex and highly structured. They arrive at "optimized" 
or "organized" states through deliberate design or bio- 
logical evolution, and exhibit robust, yet fragile (RYF) 
characteristics, the essence of HOT. That is, they are ro- 
bust to normal or common perturbations, yet may be ex- 
tremely fragile to rare perturbations or design flaws, even 
if the perturbations are small and seemingly innocuous. 

Recently, HOT has been investigated in the context of a variety of specific
applications, including the Internet, the electric power grid, wildfires, and
biological networks. Typically, these studies involve a combination of simple
abstract, analytically tractable representations, which focus
on fundamental tradeoffs and derivations of the power 
laws, with detailed, high-resolution simulation models, 
aimed at pinpointing specific system and model fragili- 
ties. Here we focus specifically on the abstract models 
which have been used to describe HOT. We compare dis- 
crete and continuum models in a common framework, 
and clarify the approximations that are made and the 
ranges of applicability of the models. This forces us to ad- 
dress certain fundamental issues in probability and statis- 
tics, including distinctions between discrete and contin- 
uous distributions, and properties associated with mix- 
tures of distributions. 

One key success of HOT is to offer an alternative perspective on the origins
and ubiquity of complexity, and particularly power laws. Mathematically,
heavy tailed distributions (e.g. power laws) often require special care
because of the broad range of spatial and temporal scales over which data is
sampled. In many cases, conventional assumptions and methodologies associated
with modeling and data analysis are misleading and/or break down. One of the
goals of this
paper is to illustrate how such problems can arise, and 
to approach them in a manner which is mathematically 
rigorous. 

HOT has been compared to earlier work emphasiz- 
ing emergent complexity, where power laws arise from 
minimal tuning, on an otherwise random substrate. In 
emergent complexity, power laws are associated with fractals and
self-similarity. In many studies, HOT illustrates the differences between
organized and emergent complexity by using percolation forest fire models
from physics, but including a minimal form of optimization (intended to
capture design or evolution) and robustness tradeoffs. This produces
power laws (in better agreement with data) that arise 
from highly organized and self-dissimilar structures, the 
opposite of self-similarity. 

All of the abstract HOT models follow the same basic 
mechanistic description involving optimization of trade- 
offs in an uncertain environment. Each begins with a 
d-dimensional substrate representing the system. Each event (e.g., a power
outage or fire) is triggered by some small perturbation or spark (typically
chosen from a nonuniform distribution) which initiates a cascading failure,
resulting in loss of some portion of the substrate. All of the models
considered here assume the loss (or cost) associated with an event scales
linearly with the event size. Alternative cost functions give power laws in
cost, not necessarily raw event size. Thus cost functions that heavily weight
large events can lead to truncation of the power laws.

In HOT, resources are allocated to create barriers lim- 
iting propagation of the cascading failure events in a 
manner which optimizes the cost function (minimizing 
loss or maximizing yield). There is a limited number of 
resources available, and this constraint is modeled in one 
of two ways. The first method places a fixed limit on the 
total resources available. The second weights resource 
use alongside other costs or losses, which are associated 
with the events themselves, by including an explicit re- 
source term in the cost function. Here the key issue is to 
account explicitly for resource use. The specific form of 
the constraint does not play a significant role in deter- 
mining the size distribution. 

In HOT, optimization of the resource allocations sub- 
ject to the constraint represents design and/or evolu- 
tionary tradeoffs in systems faced with a spectrum of 
disturbances. Because resources are constrained and of- 
ten sparse or expensive, optimal solutions make efficient 
use of the resources available, resulting in HOT states 
characterized by structured, compact, d-dimensional regions surrounded by
(d-1)-dimensional barriers. In addition, for a broad class of distributions
of disturbances
(e.g. Gaussian, exponential, and Cauchy), minimization 
of the average loss results in heavy-tailed, power law dis- 
tributions in the sizes of the events. Newman et al. 
emphasize however that the specific exponents charac- 
terizing the decay of the power law distribution in HOT 
models can be different. 

In this paper we focus on three models for HOT which 
are among the simplest, and most analytically tractable 
examples. Table I summarizes their basic properties,
which will be described in detail in the following sections. 
In each case, 

• Probability p: represents uncertainty in the envi- 
ronment. 

• Loss l: represents the volume or size associated
with an individual event, which is directly propor- 
tional to the cost of that event. 

• Resources r: provide mechanisms to limit losses. 

• Constraints: are imposed on the resources. 

• Optimization: of the resource assignments subject 
to constraints leads to the HOT state. 

• Power Laws: in the cumulative event distributions, 
$P(\geq l)$ vs. $l$, are characteristic of these optimal
solutions. 

All of these models are motivated by studies of the HOT version of the
percolation forest fire model. The most well studied of these are the
continuum model generalized by Newman et al., and the Probability Loss
Resource (PLR) model. Their abstractions differ in subtle, yet important
ways, leading to differences in the predictions. The continuum model aims to
describe the continuum limit of the HOT percolation forest fire model,
building on lattice models from statistical physics, and introducing a
mean-field-like analysis of the continuum limit. In the continuum model all
aspects of the system are described as smoothly varying functions on the
substrate. The PLR model is a generalization of Shannon Source Coding Theory
from Information Theory, perhaps the simplest design model in engineering.
The PLR model begins with discrete event categories i, each of which has a
characteristic probability, resource allocation, and resulting loss. Like the
continuum model, the cuts model can be thought of as the limiting description
of a lattice model as the lattice size becomes infinite. The cuts model
represents space continuously (like the continuum model) but divides it into
discrete regions (like the PLR model) using sharp barriers, i.e. cuts.





                Continuum              PLR                    Cuts

Probability     Continuous p(x)        Discrete p_i           Continuous p(x) cut into p_i
Resources       Continuous r(x)        Discrete r_i           Discrete cuts
Constraint      Resource cost          Resource limit         N cuts
                R = ∫ r(x) dx          Σ r_i ≤ R
Losses          Continuous l(x)        Discrete l_i           Discrete l_i
Optimize        Y = 1 - ∫ p l - R      Y = 1 - Σ p_i l_i
Power law       l^-(1+1/d)             l^-(1/d)               l^-2 as l → 0,
P(≥ l) vs. l                                                  l^-1 as l → ∞ (d = 1)

TABLE I: The HOT continuum, PLR, and cuts models predict power laws based on
optimal allocation of limited resources to minimize loss in an uncertain
environment. Different assumptions in the continuum and PLR models lead to
different exponents in the dense and sparse resource regimes, both of which
can arise as (opposite) limits of the cuts model. The PLR model can be
extended to the dense resource limit (Section V), where it agrees with the
continuum and cuts models. The PLR cumulative probability P(≥ l) assumes
densely sampled data (Section III). To increase readability, constant factors
are set to unity in the equations for the yield Y, which is optimized.

A key distinction between the models is their predictions for power law
exponents. The continuum model predicts a power law in $P(\geq l)$ with
exponent $\alpha = 1 + 1/d$, while PLR predicts an exponent of $\alpha = 1/d$
for the same distribution of sparks (assuming densely sampled data). We show
the first two models match solutions in different limits of the cuts model
with an exponential distribution of sparks. In d = 1 the cuts model predicts
an exponent of $\alpha = 2$ in the limit of small events and $\alpha = 1$ in
the large-event limit. Thus the cuts model captures the power laws predicted
by the other two models in the limit of small (continuum) and large (PLR)
events.
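
The two quoted predictions can be tabulated side by side. The following is only an illustrative sketch: the function names are hypothetical, and the formulas are the exponents $\alpha = 1 + 1/d$ (continuum) and $\alpha = 1/d$ (PLR, densely sampled data) stated in the text.

```python
from fractions import Fraction

def continuum_exponent(d):
    """Cumulative exponent quoted for the continuum model: alpha = 1 + 1/d."""
    return 1 + Fraction(1, d)

def plr_exponent(d):
    """Cumulative exponent quoted for the PLR model (dense sampling): alpha = 1/d."""
    return Fraction(1, d)

if __name__ == "__main__":
    # Print both predictions for a few substrate dimensions.
    for d in (1, 2, 3):
        print(f"d={d}: continuum alpha={continuum_exponent(d)}, "
              f"PLR alpha={plr_exponent(d)}")
```

In every dimension the two predictions differ by exactly one, which in d = 1 is the gap between the small-event and large-event limits of the cuts model.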

Analysis of the cuts model provides a unifying picture 
for all the models, and a concrete illustration of how cer- 
tain key approximations made in the first two models 
can break down. We show that when the PLR and cuts 
models have sufficiently similar assumptions, their results 
agree as expected. In the dense resource regime limit (de- 
scribed by the continuum model), all three models agree. 
The cuts model also illustrates how the exponent describ- 
ing small events departs from this dense resource limit as 
the density of resources and barriers becomes lower. 

In the remaining sections of this paper, we first sum- 
marize results for the continuum (Section II) and PLR 
(Section III) models, with special attention to deriva- 
tion of the power laws, and specific features which will 
be useful for comparing models. We also discuss math- 
ematical subtleties which can arise in taking continuum 
limits in systems with sharp barriers, as well as mathe- 
matical issues which can arise in comparing continuum 
vs. discrete models and distributions, and distributions 
composed of finite mixtures of probability distributions. 
The next three sections comprise the bulk of the new an- 
alytical results in this paper. In Section IV we review 
and extend the cuts model and in Section V we compare 
it to the other models. We show that the event size distributions for the PLR
and cuts models agree when their assumptions are forced to be similar. They
also both agree with the continuum model in the limit of dense resources and
small event sizes. For the cuts model with a power law distribution of
sparks, the small event limit is described by a power law in which the
exponent depends on the distribution of sparks, ranging from the limiting
value of $\alpha = 2$ (which we obtain for an exponential spark density) for
an infinitely steep power law, to $\alpha = 1$ (sparse resource limit, and in
agreement with PLR) when $p(x) \sim 1/x$. Furthermore, for both exponential
and power law distributions of sparks, we find
that the event size distribution for the cuts model agrees 
with the PLR model in the limit of large event sizes, 
where the distribution is clearly discrete. In this case the 
agreement between models depends on the assumption 
of a sufficiently well sampled data set, which would only 
arise in the cuts and PLR models due to mixtures. In 
Section VI we return to the original HOT lattice model, 
and illustrate a subtle pathology which arises in the con- 
tinuum limit of the lattice model in the absence of an 
explicit resource cost or constraint. In Section VII we 
conclude with a discussion of our results, and the rele- 
vance of the different resource regimes in the context of 
observed data.

FIG. 1: Sample configuration of the percolation forest fire model in d = 1.
Occupied sites (black) correspond to trees, and vacant sites (white)
correspond to firebreaks. When a spark hits an occupied site it burns all
trees in the connected cluster (labeled $l_i$) of occupied sites containing
the initiating site. Fire terminates in each direction upon encountering a
firebreak, or cut, labeled $c_i$.

II. CONTINUUM MODEL 

The continuum model was first suggested as an approximate limiting
description of the HOT forest fire percolation lattice model by Carlson and
Doyle. It was later studied alongside large lattice model simulations by
Newman et al. The definition of the model most conveniently begins with the
lattice model, which we will
return to in Section VI. Strictly speaking, the continuum 
model is an approximation to the lattice model based on 
scaling arguments. It captures the power laws observed 
in the HOT lattice model in the limit of large, finite lat- 
tice sizes, and allows the size distribution to be calculated 
analytically. 

Consider a d-dimensional space, with positions labeled by the d-dimensional
vector x (these are discrete sites on a hypercubic lattice, each labeled by d
integer indices i, j, k, ..., with x = (i/N, j/N, k/N, ...), where N is the
number of sites along each axis of the lattice). In the percolation lattice
model, each position (site) is either occupied by a tree, or vacant
(firebreak). Environmental uncertainty is represented by the probability p(x)
that a spark lands at site x. A spark ignites a fire that spreads throughout
the nearest neighbor connected cluster of trees in all d directions, but
terminates at firebreaks. The resulting fire size is the total number of
sites in the burned patch, l(x), and the value of l(x) is constant within
each contiguous patch. A sample lattice configuration in the special case
d = 1 is illustrated in Figure 1. Occupied sites (black) are trees and
unoccupied sites (white) are firebreaks. Event sizes $l_i$ correspond to the
number of occupied sites between firebreaks, or cuts, labeled $c_i$.

HOT configurations optimize the layout of vacant and occupied sites to
maximize yield Y, defined to be the average number of occupied sites which
remain after a single spark lands on the lattice (averaging over the spark
distribution p(x)). For small lattices, it is possible to compute the
globally optimal solution. However, for large lattices the solution becomes
computationally intractable (and not especially informative). Instead, a wide
variety of constrained optimization schemes have been investigated, all
leading to similar results. Firebreaks are concentrated in regions of high
spark probability, so that only small fires occur in regions of the lattice
where sparks are common, while large fires occur in regions where sparks are
rare.
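
To make the yield definition concrete, here is a minimal sketch for the d = 1 lattice. The helper names `fire_sizes` and `expected_yield` are hypothetical, and dividing by the number of sites N (so that a full, unburned forest has Y = 1) is an assumption chosen to match the normalization used later for the continuum model.

```python
def fire_sizes(sites):
    """Fire size l(x) for a spark at each site of a 1-d lattice.

    sites: list of 0/1 values (1 = tree, 0 = firebreak).
    A spark on a firebreak burns nothing, so its size is 0.
    """
    n = len(sites)
    sizes = [0] * n
    start = 0
    while start < n:
        if sites[start] == 0:
            start += 1
            continue
        # Find the end of this contiguous cluster of trees.
        end = start
        while end < n and sites[end] == 1:
            end += 1
        for k in range(start, end):
            sizes[k] = end - start
        start = end
    return sizes

def expected_yield(sites, spark_prob):
    """Tree density minus the expected burned fraction for one spark."""
    n = len(sites)
    density = sum(sites) / n
    loss = sum(p * s for p, s in zip(spark_prob, fire_sizes(sites))) / n
    return density - loss

if __name__ == "__main__":
    # One firebreak placed near the (assumed) high-probability left edge.
    sites = [1, 0, 1, 1, 1, 1, 1, 1]
    sparks = [0.4, 0.3, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03]
    print(fire_sizes(sites), expected_yield(sites, sparks))
```

Moving the single firebreak changes the expected loss but not the density, which is the tradeoff that the HOT optimization resolves.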

The specialized, patterned HOT configurations reflect 
patterns in the perturbing environment. This is in sharp 
contrast to the traditional forest fire percolation model studied in
statistical physics, where configurations are essentially random, aside from
a tuned, or "self-organized" average critical density. The contrasts between
the HOT and self-organized critical lattice models are discussed in detail
elsewhere, and will not be our emphasis here.

The HOT lattice model was the first model introduced 
to illustrate the HOT mechanism, and is pedagogically 
useful in illustrating the emergence of (d-1)-dimensional barriers on the
d-dimensional substrate, as well as the high concentrations of barriers in
regions where perturbations are
common. All of the other models considered here re- 
tain these key features, but each explicitly accounts for 
the cost of resources in a different way. More impor- 
tantly, each makes different approximations in represent- 
ing continuum versus discrete spatial features of the lat- 
tice model, which lead to the different predictions for the 
event size distribution. 

In the continuum model the discrete components i/N of the d-dimensional
vector positions x are replaced in the limit $N \to \infty$ by real valued
components. The occupied (tree) and vacant (firebreak) lattice sites are
replaced by a resource density r(x), representing the local density of
firebreaks. A function l(x) represents the size of the loss which occurs when
a spark lands at position x. A key approximation relative to the original
lattice model is made in the continuum model, which represents r(x) and l(x)
as continuous functions. The idea is to
use a scaling relation, motivated by the lattice model, to 
mimic the manner in which higher resource densities lead 
to smaller fires in a given region, without accounting in 
detail for the specific configuration. 

To derive the distribution of fire sizes for the continuum model, we follow
the elegant derivation of Newman et al. The size of a firebreak surrounding a
given patch l(x) is:

$$r(x) = g\,d\,l(x)^{(d-1)/d}, \qquad (1)$$

where g is a geometric factor of order unity that depends on the shape of the
patch. It is in Eq. (1) that the dimensional relationship between resource
and loss is captured. In the continuum model the total resource use is given
by

$$R = \int r(x)\,dx, \qquad (2)$$




FIG. 2: Schematic solution of the continuum model in d = 1. Small event sizes
l(x) are associated with positions x of high spark probability p(x).
Eliminating x and integrating leads to the event size distribution
$P(\geq l)$ as described in the text. A key distinguishing feature of the
continuum model is that the event size function l(x) is a priori a continuous
function of x.

where the integral is over the d-dimensional substrate. In the continuum
model, this cost enters explicitly into the yield function. Normalizing Y by
the total volume of the substrate (i.e. Y = 1 corresponds to a fully occupied
forest, with no fires or firebreaks), and averaging over the distribution of
sparks p(x), we write the expected yield as

$$Y = 1 - c \int p(x)\,l(x)\,dx - aR, \qquad (3)$$

where c is the cost per unit area (or generally, d-dimensional volume) of
forest, and a is the cost per unit length (or (d-1)-dimensional volume) of
firebreaks. This yield function is motivated by tradeoffs inherent in the
original lattice model, where the resources are empty sites, and the cost of
firebreaks is the yield penalty in initial density associated with creation
of vacancies. However, unlike the lattice model, it includes a nonvanishing
resource term explicitly in the yield function, and allows the constants a
and c to scale differently with dimension d. This fortuitously avoids a
pathology which results from the difference in scaling between the compact,
d-dimensional clusters of trees, and the (d-1)-dimensional firebreaks, which
arises in the lattice model as $N \to \infty$. We discuss this in more detail
in Section VI.

The optimal allocation of resources r(x) maximizes the expected yield.
Optimizing resources is equivalent to optimizing over event sizes because
they are explicitly related via Eq. (1). To obtain the solution, we assume
that l(x) is a continuous function of the ignition site x, and set the
functional derivative $\delta Y/\delta l(x)$ equal to zero. This leads to

$$l(x) = C\,p(x)^{-d/(d+1)}, \qquad (4)$$

where C is a constant that depends on a, c, and g.
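
The variational step can be sketched as follows. This is an illustrative reconstruction rather than the derivation of the cited work: it assumes that the firebreak cost enters the yield per unit volume, i.e. as perimeter divided by patch volume, $g\,d\,l^{(d-1)/d}/l = g\,d\,l^{-1/d}$. The explicit value of the constant is a consequence of that assumption.

```latex
\begin{align*}
Y[l] &= 1 - c\int p(x)\,l(x)\,dx \;-\; a\,g\,d\int l(x)^{-1/d}\,dx, \\
\frac{\delta Y}{\delta l(x)} &= -c\,p(x) + a\,g\,l(x)^{-(1+1/d)} = 0, \\
l(x) &= \left(\frac{a\,g}{c}\right)^{d/(d+1)} p(x)^{-d/(d+1)} .
\end{align*}
```

Under this assumption the exponent $-d/(d+1)$ comes from balancing a loss term linear in $l$ against a barrier term scaling as $l^{-1/d}$, and the prefactor plays the role of the constant in the expression for l(x).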

A schematic solution in d = 1 is illustrated in Figure 2. It is important to
note that the continuum model departs from the original lattice model in
representing l(x) as a continuous function of x. For a given configuration in
the lattice model, l(x) assumes a constant, finite value for each contiguous
cluster of occupied sites. Therefore l(x) in that case is piecewise constant.
The continuum model represents l(x) as continuous over the entire space,
leading to Eq. (4). It is the only one of the models we consider which builds
in this assumption.

It is also possible to calculate the probability density p(l) of fire sizes
for the continuum model. Again, assuming l(x) is continuous we obtain:

$$p(l) = p(x)\,\frac{d^d x}{dl} = p(x)\,\frac{dp}{dl}\,\frac{d^d x}{dp} = C\,l^{-(2+1/d)}\,p(x)\,\frac{d^d x}{dp}, \qquad (5)$$

where C is a constant that depends on d, c, a, and g. Newman et al.
thoroughly investigated the behavior of p(l) and found that the scaling
behavior is dominated by the factor of $l^{-(2+1/d)}$, while the factor
$p(x)\,d^d x/dp$ generates at most logarithmic corrections for a broad class
of probability distributions p(x).

Since the probability density p(l) is continuous, the cumulative distribution
of events of size greater than or equal to l, $P(\geq l)$, is proportional to
$l^{-(1+1/d)}$. Therefore, for a one dimensional substrate, the continuum
model predicts a slope of $\alpha = 2$ for the cumulative distribution of
events. Table I summarizes the properties of this model.
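
The d = 1 prediction can be checked numerically. The sketch below assumes an exponential spark density p(x) = exp(-x) on x ≥ 0 and the continuum solution l(x) = C p(x)^(-1/2) with C = 1; the function names, grid size, and cutoff `x_max` are illustrative choices, not part of the model.

```python
import math

def spark_density(x):
    """Assumed exponential spark density on x >= 0."""
    return math.exp(-x)

def event_size(x, C=1.0):
    """Continuum solution in d = 1: l(x) = C * p(x)**(-1/2)."""
    return C * spark_density(x) ** -0.5

def cumulative(l_min, x_max=60.0, steps=200_000):
    """P(>= l_min): spark probability of all x with l(x) >= l_min (midpoint rule)."""
    dx = x_max / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        if event_size(x) >= l_min:
            total += spark_density(x) * dx
    return total

def loglog_slope(l1, l2):
    """Slope of log P(>= l) versus log l between two event sizes."""
    return (math.log(cumulative(l2)) - math.log(cumulative(l1))) / (
        math.log(l2) - math.log(l1)
    )

if __name__ == "__main__":
    print(loglog_slope(2.0, 20.0))
```

For this choice the cumulative can also be done by hand: l(x) ≥ l corresponds to x ≥ 2 ln l, so P(≥ l) = l^(-2), and the measured slope should be close to -2.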



III. PROBABILITY LOSS RESOURCE MODEL 
(PLR) 

The PLR (Probability Loss Resource) HOT model is a generalization of Shannon
Source Coding Theory for data compression, the simplest, most elegant design
theory in engineering. It is the simplest model illustrating HOT, and is
based on optimal allocation of limited resources, with an explicit, fixed cap
on the total resources available. It retains a dimension-dependent
relationship between resources ((d-1)-dimensional) and loss (d-dimensional),
but otherwise replaces the explicit spatial variable x with a more abstract
notion of event categories i. The idea is to group similar conditions, from
the common to the rare, into a category, represented by the relative
probabilities $p_i$.

The PLR objective is to allocate resources in a manner which maximizes yield
Y averaged over a spectrum of possible events:

$$Y = 1 - c \sum_i p_i l_i, \qquad l_i = f(r_i), \qquad \sum_i r_i \leq R. \qquad (6)$$

Here c is a constant, and i, $1 \leq i \leq N$, indexes the finite and
discrete set of probabilities $p_i$, assumed to be in descending order, with
corresponding loss $l_i$. Normalized, the cumulative
$P(\geq l_i) = \sum_{j \geq i} p_j$ is the rank order divided by the total
number of events in a data set, from which corresponding values of $p_i$ may
be deduced. We will interpret the $p_i$ as probabilities, so Y is average
yield, but in general the $p_i$ could be any weights assigned to create a
cost function.

The probability $p_i$ of each category is fixed, and a total resource
allocation $r_i$ is made to the event category i, resulting in events of size
$l_i$ for the category (i.e. $r_i$ is the total resource allocation to all
the events in event category i). The $r_i$ are chosen to minimize the average
event size, averaged over the spectrum of possible conditions {i}. The only
interaction between events is that the sum of all resources is limited by
$\sum_i r_i \leq R$. This means that any reasonable design will devote more
resources to the categories of common events so that they yield small losses,
leaving relatively few resources for rare events.

Unlike explicitly spatial lattice models, the PLR model presumes a
mean-field-like independence of events. However, a lattice abstraction (which
should not be interpreted as a literal gridding of the forest) can be used to
derive the relationship $l_i = f(r_i)$ between resource allocation and loss
for the event categories {i}. Imagine a large, finite d-dimensional lattice
which is an abstraction of a space representing a single condition category
i. The lattice is of length L on each side, and the total volume $L^d$ serves
as the large scale cutoff, i.e. the size of the largest possible event. The
value of $p_i$ is the total probability of hitting any part of the lattice
for category i, and the probability of hitting any one of the cells within
category i is equal. Resources $r_i$ represent the total allocation of vacant
sites within the i-th category.

Because the spark distribution $p_i$ is uniform within each category, the
optimal use of resources (vacant sites) defines a collection of equally
spaced (d-1)-dimensional surfaces, one lattice spacing wide in the remaining
dimension, on an otherwise occupied lattice. This defines a set of compact,
contiguous cells, all of equal size $l_i$, for category i. For example, in
d = 1, the barriers correspond to a single unoccupied site between contiguous
occupied sites of equal length. This is similar to the lattice shown in
Figure 1, except the occupied regions $l_1, l_2, \ldots$ would all have the
same length.

Suppose a resource allocation of size $r_i = d\,L^{d-1}\,\xi$ (number of
vacancies) is made to category i, arranged as $\xi$ equally spaced cuts,
spanning the full length of the lattice L in each of the d dimensions. Then
the event size $l_i$ for category i is $l_i = ((L/\xi) - 1)^d$. Eliminating
$\xi$ yields a relationship between event size $l_i$ and the resource
allocation $r_i$, which scales like $l_i \sim r_i^{-d}$. Here L is simply the
constant subregion lattice length scale, and the key result is the
dimensional relationship between resource allocation (to the event category
as a whole) and the corresponding characteristic loss size for that category.
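
The scaling between event size and resource allocation can be checked with a short sketch. It assumes that the number of vacancies is d times L^(d-1) times the number of cuts per axis (ignoring the double counting where cuts intersect) and that each enclosed cell has side L/n_cuts - 1. The names `plr_cell` and `scaling_exponent` are hypothetical.

```python
import math

def plr_cell(L, n_cuts, d):
    """Resources used and cell volume for one event category.

    Assumes n_cuts equally spaced cuts along each of the d axes of an
    L**d lattice; intersections between cuts are counted more than once.
    """
    resources = d * L ** (d - 1) * n_cuts   # vacant sites in the barriers
    event_size = (L / n_cuts - 1) ** d      # volume of each enclosed cell
    return resources, event_size

def scaling_exponent(L, n1, n2, d):
    """Log-log slope of event size versus resources between two designs."""
    r1, l1 = plr_cell(L, n1, d)
    r2, l2 = plr_cell(L, n2, d)
    return (math.log(l2) - math.log(l1)) / (math.log(r2) - math.log(r1))

if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, scaling_exponent(10**6, 10, 100, d))
```

As long as the cells are much larger than one lattice spacing, the slope approaches -d, which is the dimensional relationship used in the resource versus loss function.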

This process is illustrated for d = 1 in Figure 3. Three event categories
(i = 1, 2, 3) are shown, in order of descending probability
$p_1 > p_2 > p_3$. Here the constant subregion size L is the identical
horizontal length of the line segment associated with each region. The
vertical height of each box reflects the probabilities $p_i$ (and is not
related to any spatial dimension or length scale). Resources $r_i$ are
allocated to each region, with $r_1 > r_2 > r_3$, and divide each region into
line segments $l_i$ of equal size, with $l_1 < l_2 < l_3$.

FIG. 3: Resources allocated to event category i in the PLR model in d = 1
divide a region of fixed length L (horizontal axis) into events of equal
length $l_i$, characteristic of the category. In the optimal solution,
regions of high probability (vertical axis) are allocated more resources,
resulting in smaller events.

Incorporating a cutoff at small event sizes, and normalizing so that
$0 \leq r_i \leq 1$ with f(1) = 0, in d dimensions we write

$$f(r_i) = \gamma\,(r_i^{-d} - 1), \qquad d > 0, \qquad (7)$$

which incorporates the scaling determined above, and uniquely determines
$f(r_i)$ up to the parameter $\gamma$. As in the original Shannon Theory, we
relax the constraint that the $r_i$ take integer values. This is an extremely
simple and tractable model with essentially only one parameter, the dimension
of the substrate, where events are characterized by d-dimensional, compact
regions, enclosed by (d-1)-dimensional perimeters.

Given a fixed resource budget R, the goal outlined in Eq. (6) is to optimize
the division of resources to maximize yield, by minimizing the expected loss
$\sum_i p_i l_i$ subject to the resource vs. loss relationship in Eq. (7).
This is accomplished using standard constrained optimization methods
(Lagrange multipliers). Setting the gradient of
$\lambda(\sum_i r_i - R) + \sum_i p_i f(r_i)$ equal to zero yields
$-p_i f'(r_i) = \lambda$, which equalizes the expected marginal loss and can
be solved for the $r_i$. Then the optimal $\lambda$ saturates the resource
constraint with $\sum_i r_i = R$, $r_i \leq 1$, yielding



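$$r_i = \frac{R\,p_i^{1/(1+d)}}{\sum_j p_j^{1/(1+d)}}, \qquad (8)$$

so that

$$l_i = \gamma\left[\left(\frac{R\,p_i^{1/(1+d)}}{\sum_j p_j^{1/(1+d)}}\right)^{-d} - 1\right], \qquad (9)$$

$$J = \sum_i p_i l_i = \gamma\left[R^{-d}\Big(\sum_j p_j^{1/(1+d)}\Big)^{1+d} - 1\right]. \qquad (10)$$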
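
As a numerical illustration of the Lagrange multiplier solution, the sketch below assumes the resource versus loss function f(r) = gamma (r^(-d) - 1), for which the stationarity condition -p_i f'(r_i) = lambda gives r_i proportional to p_i^(1/(1+d)). The function names and the example probabilities are hypothetical.

```python
def optimal_allocation(p, R, d):
    """Allocation r_i = R * p_i**(1/(1+d)) / sum_j p_j**(1/(1+d))."""
    weights = [pi ** (1.0 / (1 + d)) for pi in p]
    total = sum(weights)
    return [R * w / total for w in weights]

def loss(r, d, gamma=1.0):
    """Assumed resource vs. loss relation f(r) = gamma * (r**-d - 1)."""
    return gamma * (r ** -d - 1.0)

def marginal_loss(pi, ri, d, gamma=1.0):
    """Expected marginal loss -p_i f'(r_i) = p_i * gamma * d * r_i**-(d+1)."""
    return pi * gamma * d * ri ** -(d + 1)

if __name__ == "__main__":
    p = [0.5, 0.3, 0.15, 0.05]          # illustrative category probabilities
    r = optimal_allocation(p, R=1.0, d=1)
    print("allocations:", r)
    print("losses:", [loss(ri, 1) for ri in r])
    print("marginal losses:", [marginal_loss(pi, ri, 1) for pi, ri in zip(p, r)])
```

At the optimum the marginal losses are identical across categories, the allocations sum to R, and the most probable categories receive the most resources and therefore suffer the smallest losses.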



Inverting yields a relationship between the event type and corresponding
probability:

$$p_i(l_i) \propto (C + l_i)^{-(1+\alpha)}, \qquad (11)$$

where $\alpha = 1/d$ and C is a constant (which depends on $\gamma$, d, and R
in Eq. (9)) which sets the small size scale in the resource vs. loss
relationship. For simplicity, we will assume throughout that C is
sufficiently small that we can neglect any small size cutoff.

The PLR model is defined in terms of noncumulative probabilities $p_i$, but
to reliably compare with data it is necessary to use cumulative
distributions. Since $p_i \propto l_i^{-(1+\alpha)}$ (Eq. (11)), the naive
expectation is that the cumulative distribution
$P(\geq l_i) \propto l_i^{-\alpha}$. However, this is not necessarily the
case for discrete data sets, where cumulative distributions are obtained by
summing, rather than integrating, the density. In fact, in the discrete case,
the cumulative distribution can be steeper, shallower, or have the same decay
properties as the density, depending on how densely the data is sampled.
Thus, unlike the case of a continuous probability density, there is no
general relationship between discrete cumulative distributions and their
noncumulative densities. We cannot simply assume that since $p_i(l_i)$ is a
power law with slope $-(1+\alpha)$, $P(\geq l_i)$ is a power law, let alone
with slope $-\alpha$. This issue is fundamental in the theory of discrete
probability distributions, and also arises for the cuts model (Section IV),
which is also inherently discrete, and in comparing PLR with the continuum
and cuts models (Section V).
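
The dependence on sampling density can be seen in a small numerical experiment. The sketch assumes weights p_j proportional to l_j^(-(1+alpha)) with alpha = 1 and compares two hypothetical catalogs: event sizes on a grid with fixed spacing, and event sizes with a fixed ratio between neighbors. The catalogs and function names are illustrative only.

```python
import math

def tail_sums(sizes, alpha):
    """Unnormalized P(>= l_i) = sum over j >= i of l_j**-(1 + alpha).

    sizes must be sorted in ascending order.
    """
    tail, total = [], 0.0
    for l in reversed(sizes):
        total += l ** -(1.0 + alpha)
        tail.append(total)
    tail.reverse()
    return tail

def loglog_slope(sizes, tail, i, j):
    """Slope of log P(>= l) versus log l between entries i and j."""
    return (math.log(tail[j]) - math.log(tail[i])) / (
        math.log(sizes[j]) - math.log(sizes[i])
    )

alpha = 1.0
dense = [float(k) for k in range(10, 100_000)]   # fixed spacing between sizes
sparse = [2.0 ** k for k in range(1, 41)]        # fixed ratio between sizes

dense_slope = loglog_slope(dense, tail_sums(dense, alpha), 0, 1000)
sparse_slope = loglog_slope(sparse, tail_sums(sparse, alpha), 0, 10)

if __name__ == "__main__":
    print("fixed spacing:", dense_slope)
    print("fixed ratio:  ", sparse_slope)
```

With fixed spacing the sum behaves like an integral and the cumulative slope is close to -alpha. With a fixed ratio between neighboring sizes the sum is dominated by its first term, so the cumulative decays with the same slope, -(1 + alpha), as the weights themselves.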

Furthermore, in making comparisons with data, use of 
the density, rather than the distribution does not solve 
the problem. Use of the cumulative distribution is in fact 
preferable, because it avoids statistical anomalies associ- 
ated with binning. The cumulative distribution simply 
corresponds to a normalized plot of the ranked (by size) 
order of events in a catalog, which does not introduce 
any statistical biases. 

Although the PLR model can be used to generate a 
cumulative event probability function $P(\geq l)$ which is inherently
discrete, most data sets exhibiting power laws in
the cumulative event probability as a function of size are 
sufficiently dense to exhibit a fairly convincing unit dif- 
ference in slope between the density and the cumulative 
distribution. This leads us to determine circumstances 
under which the naive expectation of unit difference in 
the exponent between the cumulative distribution and 
the noncumulative density is in fact correct. 

This requires sampling in the data set which is sufficiently dense that
integration of the density to obtain the cumulative distribution is a good
approximation to computing the discrete sum. One possible explanation is to
hypothesize that most data sets are mixtures from many different systems, or
the same system averaged over long times. Thus a complete treatment of how to
assess whether data is consistent with a PLR mechanism ultimately requires a
treatment of mixtures.

The simplest scenario corresponds to a mixture of discrete
power law distributions with the same exponent. This
generates a power law with that same exponent, but possibly
different short- and large-scale cutoffs, and provides a
simple and unambiguous way to connect the PLR
p_i ∝ (l_i)^{-(1+α)} with P(> l_i) ∝ (l_i)^{-α}. This
scenario assumes sufficient data up to some cutoff size L,
binned with fixed Δl, to treat the resulting p_i as binned
samples from a continuous density. Then we can define



\[
P(> l_i) = \sum_{j \ge i} (l_j + C)^{-\alpha-1} (l_{j+1} - l_j)
         = \sum_{j \ge i} (l_j + C)^{-\alpha-1} \, \Delta l
\qquad (12)
\]



which in the limit of large data sets approximates a
continuous P(> l) satisfying

\[
P(> l) \propto \int_l^L p(x)\,dx = \int_l^L (x + C)^{-\alpha-1}\,dx
       \propto \left( (l + C)^{-\alpha} - (L + C)^{-\alpha} \right)
\qquad (13)
\]



leading to the exponent α = 1 in d = 1. Table I assumes
these properties of the PLR model. Note, however, that
when the PLR model is used and the l_i are not densely
sampled, the above calculations for the cumulative
distribution need not hold.
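As a rough numerical illustration of Eqs. (12) and (13), the sketch below sums a binned density proportional to (l + C)^{-(1+α)} over bins of fixed width Δl and compares the result with the closed-form integral. The parameter values (α = 1, C = 1, L = 1000, Δl = 0.01) and the function names are illustrative assumptions, not values taken from the paper.

```python
import math

def cumulative_from_bins(l, alpha=1.0, C=1.0, L=1000.0, dl=0.01):
    """Left Riemann sum of (x + C)^-(alpha+1) over fixed-width bins from l to L, as in Eq. (12)."""
    n_bins = int(round((L - l) / dl))
    return sum((l + j * dl + C) ** (-alpha - 1.0) * dl for j in range(n_bins))

def cumulative_integral(l, alpha=1.0, C=1.0, L=1000.0):
    """Integral of (x + C)^-(alpha+1) from l to L, as in Eq. (13), including the 1/alpha prefactor."""
    return ((l + C) ** (-alpha) - (L + C) ** (-alpha)) / alpha

def local_slope(f, l1, l2):
    """Slope of f on a log-log plot between sizes l1 and l2."""
    return (math.log(f(l2)) - math.log(f(l1))) / (math.log(l2) - math.log(l1))

binned = cumulative_from_bins(10.0)
exact = cumulative_integral(10.0)
slope = local_slope(cumulative_from_bins, 10.0, 20.0)
```

With these assumed values the binned sum tracks the integral closely, and the local slope between l = 10 and l = 20 is close to -α = -1. The residual deviation comes from the offset C and the cutoff L.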



IV. CUTS MODEL 

The cuts model is a simple, analytic model that helps
clarify the discrepancy between the power law exponents
predicted by the continuum and PLR models. We focus on
d = 1 for this case and the comparisons. Higher dimensional
generalizations of the cuts model are possible, but
correspond to constrained optimization schemes (e.g. the
grid design problem) or choices of p(x) with special
symmetries. As we show below, the other two models as
formulated above agree with the cuts model in (different)
asymptotic regimes. We also use the cuts model to formulate
an extension of the PLR model that describes the dense
resource limit, where all three models agree.

Like the continuum model, the cuts model is naturally
understood as a continuum limit of a percolation lattice
model, but it is a variant of percolation which includes
an explicit constraint on the resources, as in PLR. The
cuts model removes the assumption that the event sizes
l(x) are nearly continuous (an approximation made in
the continuum model), which makes it possible to span
both the dense and sparse resource regimes in a single
formulation of the model.

FIG. 4: Illustration of the cuts model mapping from a
probability function p(x), which is a continuous function of
the spatial coordinate x, to a discrete set of probabilities
p_i. The cut positions are chosen to optimize a yield
function, Y or Y*.

Consider a percolation forest fire lattice model in one 
dimension. Resources are vacancies that act as dividers 
or cuts between connected clusters of occupied sites. An 
example of this is shown for d = 1 in Figure ^ If we 
take a continuum limit by rescaling into a finite interval 
and taking the number of lattice sites to infinity,
the cuts become infinitesimally thin, zero dimensional
dividers between continuous connected regions of unit 
density. 

The cuts model is defined on position space x, x ∈
[0, Λ] ⊂ ℝ, where Λ is the large-scale cutoff. A discrete
set of zero-dimensional cuts divides the axis into a set of
separate one-dimensional line segments. The model imposes
the constraint that the maximum number of cuts is a natural
number N, analogous to the PLR model's explicit constraint
on total resources (Σ_i r_i ≤ R). Optimal solutions make
full use of all available resources (Σ_i r_i = R in PLR and
# cuts = N in cuts). Events are triggered (sparked)
according to a spatial probability function p(x) as in the
continuum model, propagating along the connected cluster,
between adjacent cuts. The position of the ith cut is
labeled c_i, and c_0 is at x = 0. The cut positions define
discrete line segments l_i and the corresponding event
probabilities p_i:



\[
l_i = c_i - c_{i-1}
\quad \text{and} \quad
p_i = \int_{c_{i-1}}^{c_i} p(x)\,dx.
\qquad (14)
\]



In other words, the cuts map the continuous spatial function
p(x) defined on [0, Λ] ⊂ ℝ into a discrete set of events
with probability p_i given by the cumulative probability of
sparking the segment of length l_i between adjacent cuts.
This mapping is illustrated in Figure 4.

Carlson and Doyle maximized the yield function

\[
Y = 1 - J = 1 - \sum_j p_j l_j
\qquad (15)
\]

with respect to the cut positions. Note that this is the
same yield function used in the PLR model. They found
an iterative solution for the optimal cut positions in the
continuum limit:



\[
p_i + l_i \, p(c_i) = p_{i+1} + l_{i+1} \, p(c_i).
\qquad (16)
\]



Unfortunately, analytic solutions to this equation involve
transcendental functions even if there is a simple
functional form for p(x).

The problem simplifies if we consider a slightly modified
cost function, replacing J = Σ_j p_j l_j in Eq. (15) with
J* = Σ_i p*_i l_i, where p*_i is the probability of events
of index i or greater, p*_i = Σ_{j≥i} p_j. This cost
function can be naturally motivated in many cases, such as
web layout. Furthermore, as we show below, results obtained
for the power laws using this modified cost function are
equivalent to those for the original cost function in the
small and large size asymptotic regimes.

With the modified cost function, we can define the
yield as

\[
Y^* = 1 - J^* = 1 - \sum_i p^*_i l_i
\qquad (17)
\]



and optimize the yield with respect to the cut positions
c_i by setting ∂Y*/∂c_i = 0. Using the definitions from
Eq. (14), the following iterative equations hold for the
optimal cut positions:

\[
p_i = l_{i+1} \, p(c_i).
\qquad (18)
\]



This equation is simpler to iterate than Eq. (16). Its
solutions are no longer transcendental functions, and the
optimal l_i for general p(x) can easily be found using
simple numerical techniques. Note that the number of cuts N
does not appear explicitly in the recursion relation.
Instead, the equation requires two initial cut positions,
c_i and c_{i-1} (which is the lower limit of integration for
the integral defined as p_i). These initial cut positions
define a length scale, l_i = c_i - c_{i-1}. This length
scale, together with the large-scale cutoff Λ, determines
the total number of cuts, N. Therefore, choosing two initial
cut positions is equivalent to specifying N for a fixed Λ.
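The forward iteration of Eq. (18) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `iterate_cuts_forward`, the use of a closed-form cumulative function to evaluate p_i, and the example values (λ = 1, seed cuts at 0 and 0.1, cutoff 50) are all assumptions made here.

```python
import math

def iterate_cuts_forward(pdf, cdf, c_prev, c_curr, cutoff, max_cuts=10_000):
    """Generate cut positions from two seed cuts using p_i = l_{i+1} * p(c_i), Eq. (18)."""
    cuts = [c_prev, c_curr]
    while len(cuts) < max_cuts:
        p_i = cdf(cuts[-1]) - cdf(cuts[-2])   # Eq. (14): probability of sparking segment i
        l_next = p_i / pdf(cuts[-1])          # Eq. (18) solved for l_{i+1}
        c_next = cuts[-1] + l_next
        if c_next > cutoff:                   # stop once the large-scale cutoff is passed
            break
        cuts.append(c_next)
    return cuts

# Example with an exponential spark density (assumed rate lam = 1).
lam = 1.0
cuts = iterate_cuts_forward(
    pdf=lambda x: lam * math.exp(-lam * x),
    cdf=lambda x: 1.0 - math.exp(-lam * x),
    c_prev=0.0, c_curr=0.1, cutoff=50.0,
)
sizes = [b - a for a, b in zip(cuts, cuts[1:])]
```

The number of cuts returned depends only on the two seed positions and the cutoff, which is the sense in which choosing the seeds fixes N for a given Λ.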



Cuts model for an exponential distribution of sparks 

To solve the recursion equation analytically we first
choose p(x) = λ e^{-λx}, which leads to an especially simple
solution:

\[
p_i = e^{-\lambda c_{i-1}} - e^{-\lambda c_i}.
\qquad (19)
\]



As with the other two models, we are interested in the
probability distribution of event sizes p(l_i) and the
cumulative probability distribution P(> l_i). In this case,
solving for P(> l_i) is transparent:

\[
P(> l_i) = p^*_i = e^{-\lambda c_{i-1}}.
\qquad (20)
\]
FIG. 5: Event size l(x) as a function of x for the cuts model.
When the value of l(x) is small (and x is small) the function 
is nearly continuous but when the value of l(x) is large the 
function is piecewise constant, displaying obvious discontinu- 
ities. 



We substitute p(x) into Eq. (18) to find a recursion
relation for the optimal region sizes:

\[
l_{i+1} = \frac{e^{\lambda l_i} - 1}{\lambda}.
\qquad (21)
\]



Notice that the event sizes increase exponentially as l_i
becomes large. We use l_i to construct the function l(x),
which is defined as the event size l_i when a spark hits
site x. This function is piecewise constant between cuts, as
illustrated in Figure 5. For large x, the function exhibits
large discontinuities. For small x, while still discrete, it
approaches a continuous function.

The slope of P(> l_i) on a log-log plot can be easily
calculated in limiting cases by substituting Eq. (21) into
Eq. (20), dividing Δ log p*_i by Δ log l_i, Taylor
expanding, and dropping higher order terms. The limiting case
describing the large event sizes, with sparse resource al- 
locations, is discussed in Q. Following the derivation 
there: 



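\[
\lim_{l_i \to \infty} \frac{\log p^*_{i+1} - \log p^*_i}{\log l_{i+1} - \log l_i}
 = \lim_{l_i \to \infty} \frac{\log e^{-\lambda c_i} - \log e^{-\lambda c_{i-1}}}{\log(e^{\lambda l_i} - 1) - \log \lambda l_i}
 = \lim_{l_i \to \infty} \frac{-\lambda l_i}{\log(e^{\lambda l_i} - 1) - \log \lambda l_i}
 = -1
\qquad (22)
\]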



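The opposite limiting case, describing small events and
high resource densities, can also be calculated. We find:

\[
\lim_{l_i \to 0} \frac{\log p^*_{i+1} - \log p^*_i}{\log l_{i+1} - \log l_i}
 = \lim_{l_i \to 0} \frac{-\lambda l_i}{\log(e^{\lambda l_i} - 1) - \log \lambda l_i}
 = \lim_{l_i \to 0} \frac{-\lambda l_i}{\log\left(1 + \tfrac{1}{2} \lambda l_i + \dots\right)}
 = -2
\qquad (23)
\]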
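The two limits can be checked numerically from the recursion for the region sizes alone. The sketch below assumes p*_i = e^{-λ c_{i-1}}, so that the change in log p*_i between consecutive segments is -λ l_i, and evaluates the ratio of log differences directly. The function name and the sample sizes are illustrative choices.

```python
import math

def local_slope(l_i, lam=1.0):
    """Log-log slope of p*_i vs. l_i between consecutive segments for p(x) = lam * exp(-lam * x)."""
    l_next = math.expm1(lam * l_i) / lam          # next region size from the recursion
    return -lam * l_i / math.log(l_next / l_i)    # (delta log p*) / (delta log l)

small_event_slope = local_slope(1e-4)   # dense resources, small events
large_event_slope = local_slope(200.0)  # sparse resources, large events
```

For small l_i the ratio is close to -2. For large l_i it approaches -1 from below, but only slowly, because the correction decays like log(λ l_i)/(λ l_i).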



We can also investigate the asymptotic behavior for
small and large events in this model numerically by
choosing two initial cut positions c_i and c_{i-1} and then
iterating Eq. (18) backwards and forwards. The cumulative
probability P(> l_i) (large circles) vs. the event size
l_i is shown in Figure 6(a) and (b), and the limiting power
law behaviors in the small and large event size limits
derived analytically are apparent. Notice that there are
only a few points in the slope = -1 regime in Figure 6(a).
This is because the event sizes are increasing exponentially
as shown in Eq. (21). We can populate the tail
of this distribution by combining many data sets with
slightly different initial cut positions, c_i, c_{i-1}, and
the results are shown in Figure 7. This models a mixture of
data from systems with the same number of resources N
but different large scale cutoffs, Λ.

Figure 6(a) and (b) also show the probability density
p(l_i) (small squares) vs. the event size l_i. For small l,
the cumulative probability has a steeper slope than the
probability density, while for large l the points are very
nearly the same and have the same slope. This occurs because
the probability density for the cuts model is inherently
discrete. As discussed in Section III, if a probability
density is smooth and continuous, the corresponding
cumulative probability can be found by integrating the
density. For example, if p(x) is a power law with exponent
-(α + 1), then the cumulative probability is a power law
with exponent -α, as we intuitively expect. This simple,
intuitive result also applies when data consists of a set of
discrete probabilities p_i which are sufficiently dense that
we can use them to derive a continuous probability
distribution as we did in the PLR model, Eq. (12). However,
in Figure 6 the discrete probabilities p_i are not dense, and
the relationship between P(> l) and p_i is not the same
as in the continuous case. As l_i becomes large, Eq. (21)
indicates l_{i+1} >> l_i, and p_{i+1} << p_i. The cumulative
probability distribution becomes the same as the
probability density in the tail:

\[
P(> l_i) = \sum_{j \ge i} p_j \approx p_i.
\qquad (24)
\]

This asymptotic behavior is verified in Figure 6(a) and
Figure 7.

FIG. 6: Numerical results for the cuts model with an
exponential distribution of sparks. Figure (a) shows the
cumulative probability P(> l) (large circles) and probability
density p(l) (small squares) vs. event size l for the one
dimensional exponential cuts model (i.e. p(x) = λ e^{-λx})
with the modified yield function (using p*_i). Points are
calculated from Eq. (18) iterated backwards and forwards from
the initial cut positions c_i = 300, c_{i-1} = 250. The 4th
iteration forwards results in a data point too large to
compute. The solid line illustrates a power law with exponent
-2, and the dashed line illustrates exponent -1. Figure (b)
is an enlarged view of (a) in the region where l is small.
For small l, the cumulative probability has a steeper slope
than the probability density, but for large l their slopes
are the same. Again, the solid line illustrates exponent -2
and the dashed line is -1.

FIG. 7: Cumulative probability P(> l) vs. l for a mixture of
data sets with slightly different large scale cutoffs, Λ.
These data sets are generated as in Figure 6 but sampling the
initial, seed cut positions randomly, and combining data from
the different choices. The second cut position c_i was chosen
from a uniform distribution on [200, 300] and the distance
between the first and second cut l_i was chosen from a uniform
distribution on [25, 50]. Note: c_{i-1} = c_i - l_i.
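The mixing procedure described for Figure 7 can be sketched as follows. This is only an illustration under stated assumptions: the rate λ = 0.02 is a guess (the text here does not give the value used), the helper names `tail_points` and `mixture` are hypothetical, and each point uses P(> l_i) = e^{-λ c_{i-1}} without renormalizing across the pooled data.

```python
import math
import random

def tail_points(c_i, l_i, lam, max_exponent=700.0):
    """Iterate the exponential recursion forwards from one seed; return (l, P(> l)) pairs."""
    points = []
    c_prev = c_i - l_i                       # left end of the seed segment
    while True:
        points.append((l_i, math.exp(-lam * c_prev)))
        if lam * l_i > max_exponent:         # the next size would overflow a float
            break
        c_prev += l_i                        # move to the next cut
        l_i = math.expm1(lam * l_i) / lam    # next region size
    return points

def mixture(n_seeds, lam=0.02, seed=0):
    """Pool points from seeds drawn as in the Figure 7 caption: c_i in [200, 300], l_i in [25, 50]."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_seeds):
        pooled.extend(tail_points(rng.uniform(200.0, 300.0), rng.uniform(25.0, 50.0), lam))
    return sorted(pooled)

single = tail_points(300.0, 50.0, 0.02)
pooled = mixture(50)
```

A single seed yields only a handful of points before the sizes become too large to represent, while the pooled set contains many points at large l, which is the effect the mixture is meant to show.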



Cuts model for a power law distribution of sparks 

We can also analytically solve for the optimal cut
positions in the case of a power law distribution of sparks:
p(x) = a x^{-(a+1)}. Using the same procedure as in
the exponential case, we find the following results for the
discrete probabilities and the corresponding event sizes:

\[
p_i = (c_{i-1})^{-a} - (c_i)^{-a}
\qquad (25)
\]

so that

\[
l_{i+1} = \frac{(c_{i-1})^{-a} - (c_i)^{-a}}{a \, (c_i)^{-(a+1)}}.
\qquad (26)
\]
FIG. 8: Cumulative probability P(> l) (large circles) and
probability density p(l) (small squares) vs. event size l for
the cuts model with power law spark probability density with
parameter a = 2 (in p(x) = a x^{-(a+1)}). Points are calculated
from Eq. (26) iterated forwards 2000 times from the initial
cut positions c_{i-1} = 1, c_i = 1.001. The solid line
illustrates a power law with exponent -4/3, and the dashed
line illustrates exponent -1.



For a > 1, the slope of P(> l) vs. l on a log-log plot
approaches -2a/(a + 1) as l becomes small. As l becomes
large, the slope approaches -1. These asymptotic
relationships are derived in Appendix A. In addition, as a
approaches infinity the initial probability density
p(x) = a x^{-(a+1)} decays faster than any power law.
Notice that in the limit a → ∞ we recover -2 as the
exponent for the cumulative probability distribution, which
is exactly the same as the exponential result.

We also investigate the event size distribution for
power law p(x) by solving the recursion relation in
Eq. (26) numerically. Figure 8 shows the cumulative
(large circles) and noncumulative (small squares) event
size distributions for a power law spark distribution with
a = 2 (i.e. p(x) = a x^{-(a+1)} = 2x^{-3}). For small l this
leads to a cumulative probability distribution of event
sizes that has a shallower slope than the corresponding
data for the exponential spark density (Figure 6), but still
a steeper slope than for larger events. The slope of the
cumulative distribution is close to the analytically
calculated asymptotic value of -2a/(a + 1) = -4/3
in the small event limit (Appendix A). For large l the
slope is approximately -1. The corresponding data for
the case a → 1 (p(x) ~ 1/x^2) has slope -1 for the entire
range of event sizes. Additionally, solutions of the cuts
model obtained for a power law distribution of sparks
have the feature that the large event sizes l_i increase at
a slower rate than in the corresponding exponential
solution. Therefore we are able to see more points in the
tail of Figure 8 and easily confirm the slope -1 that we
derive analytically (Appendix A).
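The small-event slope can be reproduced with a short iteration of the power law recursion. The sketch below assumes P(> l_i) = (c_{i-1})^{-a}, which follows from integrating p(x) = a x^{-(a+1)} from c_{i-1} to infinity, and it stops after 500 steps so that it stays well inside the small-event regime. The step count and function names are choices made here, not the paper's.

```python
import math

def power_law_cuts(a=2.0, c_prev=1.0, c_curr=1.001, steps=500):
    """Iterate the power law recursion forwards; return event sizes and cumulative probabilities."""
    sizes, cumulative = [], []
    for _ in range(steps):
        sizes.append(c_curr - c_prev)                  # l_i = c_i - c_{i-1}
        cumulative.append(c_prev ** (-a))              # assumed P(> l_i) = c_{i-1}^(-a)
        p_i = c_prev ** (-a) - c_curr ** (-a)          # Eq. (25)
        l_next = p_i / (a * c_curr ** (-(a + 1.0)))    # Eq. (26)
        c_prev, c_curr = c_curr, c_curr + l_next
    return sizes, cumulative

def loglog_slopes(sizes, cumulative):
    """Slopes between consecutive points on a log-log plot."""
    return [
        (math.log(cumulative[i + 1]) - math.log(cumulative[i]))
        / (math.log(sizes[i + 1]) - math.log(sizes[i]))
        for i in range(len(sizes) - 1)
    ]

sizes, cumulative = power_law_cuts()
slopes = loglog_slopes(sizes, cumulative)
```

With a = 2 and the seed cuts from the Figure 8 caption, every slope in this range stays close to -2a/(a + 1) = -4/3. Reaching the large-event regime requires many more iterations, where the sizes grow very rapidly and overflow would need to be guarded against.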



V. COMPARING MODELS 

We next make more direct comparisons between the 
continuum, PLR, and cuts models. Despite the appar- 
ent differences, we show that there are a variety of cases 
where one model can be used to approximate another. In 
these cases the resulting power laws match. However, in 
doing this we face several challenges: 

• The PLR and continuum models use the expected
event size as the cost function: J = Σ_i p_i l_i (yield
function Y in Eq. (15)). The cuts model is
most easily solved analytically for the cumulative
cost function J* = Σ_i p*_i l_i (yield function Y* in
Eq. (17)), where p*_i = Σ_{i≤j≤N} p_j.

• The continuum and cuts models specify a probability
density p(x) which is a continuous function of
the spatial position x, while the PLR model specifies
condition categories i with discrete probabilities
p_i which have no a priori association with a
position x.

• The cuts and PLR models specify a set of discrete
probabilities p_i and a corresponding set of discrete
event sizes l_i, while the continuum model uses only
continuous p(x) and l(x).

• The cumulative distribution P(> l) is an analytic
function of the probability density p(l) only if the
density is a continuous function of event size. If
instead the density p(l) (or p_i(l_i)) is discrete, as it is
for the cuts and PLR models, there is no universal
analytic relationship between the cumulative and
noncumulative distributions.

We address all these issues in the subsections that follow. 

Comparing results obtained for different cost 
functions 

To reconcile the cost functions of the different models
we can either find solutions to a "J-cuts model" which
uses the original cost function J, or we can adapt the
PLR model to use the modified cost function J*. The
recursion relation for the cuts model with the cost
function J, Eq. (16), is more difficult to solve, but
fortunately we can determine the asymptotic behavior of this
"J-cuts model" without solving those equations. This is
because the asymptotic results in the simple exponential
"J*-cuts model" (Eqs. (22) and (23)) are valid for both
cost functions J and J*. In particular, the optimal
solutions {l_i} are asymptotically equal for the two costs
(J*, J) in the limits l_i → ∞ and l_i → 0. Proof of this
result is given in Appendix B. This implies that our results
for the cuts model can be directly compared to the results
for the PLR and continuum models in these limiting cases,
as shown in Table I. Alternatively, we can modify the
PLR model to use the same cost function as the "J*-cuts
model". This is particularly simple if the probability
distribution of sparks p(x) is exponential, since cumulative
and noncumulative exponential distributions are
proportional to each other.

FIG. 9: We use the category size L to generate a piecewise
constant function p(x) of position x from the discrete set of
probabilities {p_i} in PLR.
Mapping of the PLR event categories to spatial
positions 

To compare PLR to other models, we also must decide 
how to associate the discrete probabilities p_i in the PLR
model with positions x. In the PLR model, we derive 
scaling relations between resources and event sizes by 
imagining that each event category i is associated with 
a region of the same total length L, inside of which the 
probability is a uniform pi as illustrated in Figure^ (The 
length L is later divided up into optimal event sizes l_i.)
This procedure is discussed in Section III. 

To construct a mapping from the event categories to
the real axis, we can use this length L to derive a
right-continuous piecewise constant probability function
p(x) on the real line, as illustrated in Figure 9. We order
the p_i so that they are monotonically nonincreasing,
associate each category i with a length L, and place the
categories adjacent to one another on the real line. Then
p(x) = p_i whenever x ∈ [(i-1)L, iL). We can then use the
PLR formalism to calculate the optimal event sizes l_i
within each category. This defines an event size function
l(x) which describes the size of the loss which occurs when
a spark hits position x. We define l(x) = l_i whenever
x ∈ [(i-1)L, iL). Note that L is the maximum possible
event size in PLR and the large-scale cutoff Λ is defined
by L and the number of event categories n: Λ = nL.
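The mapping from categories to positions is simple enough to write down directly. In this minimal sketch the function name `plr_to_position_space` and the example probabilities and event sizes are hypothetical; the event sizes are supplied by hand rather than computed from a PLR optimization.

```python
def plr_to_position_space(probs, sizes, L):
    """Build right-continuous piecewise constant p(x) and l(x) from PLR categories of width L."""
    # Order categories so that the probabilities are monotonically nonincreasing.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    p_sorted = [probs[i] for i in order]
    l_sorted = [sizes[i] for i in order]
    n = len(probs)
    cutoff = n * L                          # large-scale cutoff, n times L

    def index(x):
        if not 0.0 <= x < cutoff:
            raise ValueError("x lies outside [0, n * L)")
        return min(int(x // L), n - 1)      # category whose interval [(i-1)L, iL) contains x

    return (lambda x: p_sorted[index(x)]), (lambda x: l_sorted[index(x)]), cutoff

# Illustrative values only: three categories of width L = 1.
p_of_x, l_of_x, cutoff = plr_to_position_space(
    probs=[0.2, 0.5, 0.3], sizes=[1.0, 0.25, 0.5], L=1.0
)
```

Because the categories are sorted first, the most probable category occupies [0, L), and the least probable one ends at the cutoff.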

Intuitively, here it is helpful to think of the PLR model
as a coarse-grained version of the cuts model. The
piecewise constant spark probability density p^{PLR}(x) can
be viewed as an approximation to some underlying continuous
probability density p^{cuts}(x) which has been averaged
to produce a constant value over each interval of length
L. As L becomes smaller, PLR becomes a better approximation
to the cuts model with a continuous p(x).

FIG. 10: Comparing the PLR and cuts models. (a) Spark
probability density functions on a semilog plot. The solid
line represents the piecewise constant function p^{PLR}(x) =
K_1 e^{-λ(i-1/2)L}, x ∈ [(i-1)L, iL), L = 1, λ = log(4), and
the dashed line represents the continuous probability
function p^{cuts}(x) = K_2 e^{-λx}, λ = log(4). Constants K_1
and K_2 are chosen so the probability densities are
normalized on x ∈ [0, Λ = 10]. (b) Cumulative probability
P(> l_i) vs. event size l_i for the PLR model (large circles)
and the cuts model (small squares).

Comparing PLR and cuts 

The cuts and PLR models can be compared in many
regimes because they both produce inherently discrete
event size distributions. The mapping of event
categories to spatial positions, and the approximation
of a continuous p(x) = p^{cuts}(x) (for cuts) by a
piecewise constant p^{PLR}(x) composed from the p_i's, lead
to excellent agreement between the two models for a wide
range of p(x). For the cuts model we choose a continuous
probability density:

\[
p^{cuts}(x) = K_2 e^{-\lambda x}
\qquad (27)
\]

For the PLR model we choose a probability density which
is piecewise constant on intervals of length L:

\[
p^{PLR}(x) = K_1 e^{-\lambda (i - 1/2) L}, \quad x \in [(i-1)L, \, iL)
\qquad (28)
\]

We choose these densities so that p^{cuts}(x) matches
p^{PLR}(x) at the mid-point of each interval. The density
p^{PLR}(x) can be thought of as a coarse-grained average
of p^{cuts}(x). Graphs of these functions are shown in
Figure 10(a). We can trivially modify the PLR model to use
the same cost function J* because cumulative and
noncumulative exponential distributions are proportional to
each other, implying p*_i ∝ p_i.

Next we use the PLR model to find the optimal l_i, and
thus p_i(l_i) and P(> l_i), for the spark probability density
p^{PLR}(x). We take L = 1 and a large scale cutoff Λ which
is n = 10 times larger than L. The cumulative probability
P(> l_i) vs. event size l_i (large circles) is shown in
Figure 10(b). Note that P(> l_i) has an exponent of -2,
which is exactly the same as the exponent for the
noncumulative probability p_i(l_i). Again, this is due to the
discrete nature of p_i(l_i) and the exponential p(x) which
is approximated by the piecewise constant p^{PLR}(x).

To obtain the corresponding solution for the event size
distribution of the cuts model, we use the recursion
relation (Eq. (18)) to compute the optimal l_i, p(l_i) and
P(> l_i) for the continuous, exponential p(x) = p^{cuts}(x).
We choose the initial cut positions based on our solution
for the largest event obtained for the corresponding PLR
model above. In other words, we take c_i = Λ (the endpoint
of the interval on the real axis for the mapping of
the PLR categories into position space) and c_{i-1} = Λ - l_n,
where l_n is the largest event size in the PLR model. We
then iterate the recursion relation Eq. (18) backwards
until we reach the cut at position x = 0. The cumulative
probability P(> l_i) vs. event size l_i (small squares) is
shown in Figure 10(b). P(> l_i) has an exponent of -2,
in agreement with PLR for the same cost function J*
and the corresponding spark distributions p^{cuts}(x) and
p^{PLR}(x).

Thus the cumulative probabilities for the cuts and PLR
models are remarkably similar. This indicates that even
outside the asymptotic regime (l_i → 0), the cuts model
and the PLR model match for an exponential spark
probability density. Note that in this example we are still
in the regime where the cuts model solution P(> l_i) vs. l_i
has a slope of -2 on a log-log plot, that is, the dense
resource regime.

Connections between the continuum model and the 
discrete PLR and cuts models 

We next compare the continuum model, which has a
continuous event size function l(x), with the cuts and
PLR models, which both have a piecewise constant l(x)
corresponding to the discrete l_i for these models (and the
spatial mapping, in the case of PLR). The continuum
model cannot be extended outside of the dense resource
regime, because it builds in the assumption of a continuous
event size function l(x). Interestingly, all three models
can be made to agree in the dense resource limit. For
the PLR and cuts models, this corresponds to regimes
in which the piecewise constant function l(x) becomes
nearly continuous. We begin by comparing the continuum
model to the cuts model. The cuts model predicts
that for small event sizes (and thus dense resource
allocations), the function l(x) will be close to continuous
(as shown in Figure 5). We showed earlier in Eq. (23) that
in the limit l_i → 0, the J*-cuts model predicts a power
law with exponent -2. In Appendix B, we show that the
solution for the cuts model with the modified cost function
J* is the same as the solution for the cuts model
with the original cost function J in this limit. Therefore
the cuts model matches the continuum model
in the limit l_i → 0, when the two models have the same
cost function J. Note that even though l(x) is approaching
a continuous function, the event size density p(l) remains
discrete, so that the cumulative distribution of events
P(> l) is in fact a steeper power law than the density in
this regime, as illustrated numerically in Figure 6.

FIG. 11: Plot of probability density p(l_i) (small squares)
and P(> l_i) (large circles) vs. event size l_i derived from
the PLR model. The initial probability density p(x) is a
piecewise constant function over intervals of length L = 1.
p(x) is defined so that the left-hand end-point of each
interval has a value which fits an exponential density.

Next, to compare the continuum model to PLR, we
note that in PLR, l(x) becomes close to continuous when
the category size L becomes small and the event sizes
l_i become very small. Formally, this corresponds to the
limit L → 0 with l_i/L → 0 for every l_i and every L. As
for the cuts model, this is a case where the discrete PLR
model produces a nearly continuous event size function
l(x), although the event size probability density p_i(l_i)
remains sufficiently discrete that computing the cumulative
distribution P(> l_i) does not simply correspond to
a unit increase in the exponent. Instead we must be
cautious and do additional work to compute the cumulative
exponent, as we did for the cuts model in Eq. (23) and
Appendix A.

In d = 1 PLR predicts that the discrete event size
probability density is p_i(l_i) ∝ l_i^{-2}, regardless of the
density at which points in that density are sampled.
Furthermore the PLR model begins with the p_i as input
(solving for the l_i by optimizing resource allocations), so
we must work with the density first, then solve for the
cumulative distribution. Because the p_i and l_i are
discrete, there is no simple relationship between the density
p_i(l_i) and the cumulative distribution P(> l_i). Naively,
one might expect the cumulative probability to be the
integral of the probability density and guess
P(> l_i) ∝ l_i^{-1}. As we have stated previously, this is
emphatically not the case. Figure 11 is a numerical
simulation of PLR for a piecewise constant p(x) (using the
mapping of event categories into spatial positions, each of
length L given by the width over which p(x) is constant),
which is defined so that the left-hand end-point of each
interval has a value which fits an exponential density
function. This figure shows p(l_i) ∝ l_i^{-2}, as predicted,
yet P(> l_i) ∝ l_i^{-2} as well. In other words, the
cumulative and noncumulative probabilities on a log-log plot
both have a slope of -2.

In fact, it is straightforward to show analytically that
the cumulative slope matches the noncumulative slope
in this case. The PLR probabilities {p_i} are given as
exponentially distributed: p_i ∝ e^{-λ(i-1)L}, where L is
the category size, which is subdivided into regions of size
l_i. The large scale cutoff Λ is L × n, where n is the
number of categories. We can calculate the cumulative
probability:

\[
P(> l_i) = \sum_{j=i}^{\Lambda/L} p_j
         \sim \int_{(i-1)L}^{\Lambda} e^{-\lambda x}\,dx
         \propto e^{-\lambda (i-1) L}
         \propto p_i \propto l_i^{-2}
\qquad (29)
\]

where we have used the fact that because L → 0 the
reciprocal of the norm, A(L), approaches zero and we
can approximate the sum as an integral. We also drop
the term proportional to e^{-λΛ}, which is much smaller
than e^{-λ(i-1)L}. Thus the cumulative distribution is
proportional to the noncumulative distribution in this limit,
and the continuum, cuts, and PLR models all match in
the regime where resources are dense and event sizes are
small.
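The proportionality between the cumulative and noncumulative probabilities can also be checked directly, because the sum over exponentially distributed p_i is geometric: for a large number of categories, the tail sum starting at category i equals p_i / (1 - e^{-λL}). The sketch below uses λ = log(4), L = 1, and n = 10, as quoted for the Figure 10 example. The function name is an assumption.

```python
import math

def plr_exponential(lam, L, n):
    """Return normalized p_i proportional to exp(-lam*(i-1)*L) and their tail sums over j >= i."""
    p = [math.exp(-lam * (i - 1) * L) for i in range(1, n + 1)]
    total = sum(p)
    p = [x / total for x in p]
    tail, running = [0.0] * n, 0.0
    for i in range(n - 1, -1, -1):      # accumulate from the last category backwards
        running += p[i]
        tail[i] = running
    return p, tail

lam, L, n = math.log(4.0), 1.0, 10
p, tail = plr_exponential(lam, L, n)
ratios = [t / q for t, q in zip(tail, p)]
geometric_limit = 1.0 / (1.0 - math.exp(-lam * L))
```

Away from the last few categories the ratio is essentially constant, so on a log-log plot against l_i the cumulative and noncumulative points share the same slope.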

PLR and cuts for large events 

A final question is whether the PLR model is similar
to the cuts model in the limit of very large event sizes,
where the cuts model predicts P(> l_i) has an exponent
of -1. As we mentioned earlier, in one dimension the
PLR model predicts p_i(l_i) ∝ l_i^{-2} for every l_i and
every L. However, cumulative distributions which result
from discrete probability densities can have any one of
a large class of shapes and exponents. For PLR to
predict a cumulative slope P(> l_i) ∝ l_i^{-1} (i.e. the same
as the cuts model for large events), the discrete PLR points
p(l_i) must be sufficiently dense so that the summation of
those points approximates an integral. This occurs when
the l_i increase very slowly, or equivalently if the spark
probability density p(x) is very heavy tailed. For spark
probability densities p(x) (such as the exponential) which
drop off quickly, the l_i increase rapidly (see Eq. (18)) and
PLR will not predict a slope of -1 for an individual
optimized system.

Interestingly, most data from complex systems like forest
fires and web traffic are sufficiently dense
that an integral approximation is reasonable. Cumulative
slopes of -1/d are consistent with the PLR
model when interpreted as in Section III. As previously
mentioned, this might best be explained by viewing these
data sets as mixtures of data from systems which are
individually optimized. In this case a probability density
with a sparsely populated tail (such as Fig. 6(a)) might be
mixed with similar data so that the tail becomes densely
populated. This is precisely what is done for the cuts
model in Figure 7, and here the mixture power law retains
the slope -1/d = -1. Thus it is possible that mixtures
of PLR models could be made consistent with the cuts
model in the limit of large event sizes. However, because
PLR makes analytic predictions only for noncumulative
probability densities p_i(l_i), in the absence of a more
thorough analysis of mixtures of PLR solutions, we can draw
no further general conclusions about the behavior of
cumulative probabilities P(> l_i) for large l_i in this paper.
Instead, we reserve this issue for a more detailed analysis
in 0. 

VI. PATHOLOGIES OF THE LATTICE MODEL 

Abstract forest fire models have arisen as paradigms
in complex systems theory, initially for the SOC
mechanism and later also for HOT. Inspiration for SOC
comes from statistical physics, where lattice models have
played a central role in theoretical explorations of large
scale consequences of local interactions [10]. HOT is
motivated by biology and engineering, where lattice models
are a less natural starting point. Nonetheless, in an effort
to clarify comparisons between the mechanisms, and because
of their pedagogical explanatory power, study of HOT also
began with lattice models. However, in the limit of large
lattices, the HOT lattice model can become somewhat
pathological, which led to the alternative HOT models
analyzed in this paper. In this section we discuss the
nature of this pathology. It arises in what corresponds to a
natural limit for percolation in statistical physics that
goes awry in the analogous HOT model, because of the
difference in scaling between the d-dimensional contiguous
regions and the (d - 1)-dimensional barriers.

SOC builds on the concept of criticality in statisti- 
cal physics. The percolation phase transition is associ- 
ated with a critical density of occupied sites, at which a 
connected cluster of nearest neighbor occupied sites first 
spans the lattice (say, from top to bottom) in the limit of 
infinite lattice size. Infinitesimally above the critical den- 
sity, the infinite cluster exists with probability converg- 
ing to unity as the lattice size diverges. Simultaneously 
the probability any given site is connected to the infinite 
cluster converges to zero. This occurs because the infi- 
nite cluster is a fractal. An immediate consequence of 
the fact that the fractal dimension is less than the lattice 
dimension is that removal of the infinite cluster (i.e. in 
the largest possible fire) does not alter the lattice density 
even though the cluster is system-spanning (i.e. would 
stretch across the entire forest). At the critical density, 
and only at the critical density, the distribution of cluster 
sizes in the ensemble is described by a power law. 

In statistical physics power law predictions are typi- 
cally sharpened by taking the limit of infinite lattice size. 
However, in attempting this for the HOT lattice model
a problem emerges that makes the large lattice limit
ill-posed. This also reveals more clearly an intrinsic flaw in
the lattice model when it comes to modeling mechanisms
and costs associated with suppression of fires and other
cascading events in highly designed or evolved systems
[?]. Consider the lattice model in d = 2. In both the
HOT and SOC lattice models a firebreak forms when any
unbroken chain of empty lattice sites isolates a connected
cluster, even if the chain is only one lattice spacing wide.
In SOC (and criticality) the underlying randomness with
which configurations are generated, and the symmetry
between vacant and occupied sites, result in a critical
density of approximately 0.4 (0.59) in d = 2, which is bounded
away from unity, so that a finite fraction of the lattice is
devoted to both clusters and firebreaks in the limit of
infinite size.
In other words, the size of the firebreaks scales in the 
same way as the size of the connected clusters. However, 
in the HOT version, simple optimization of yield (num- 
ber of trees remaining after a single spark, averaged over 
the spark distribution) leads to macroscopic, compact 
clusters of trees separated by narrow (one lattice spac- 
ing wide), efficient (linear) firebreaks. Thus in the limit 
of large lattices the cost in density and yield associated 
with each firebreak becomes vanishingly small. 

To visualize how the cost of firebreaks becomes negligible
for large lattices and why this is a problem, consider
large $N \times N$ lattices as $N \to \infty$. A vertical line of
empty sites extending from top to bottom on the lattice
involves $N$ sites, and so the cost in lattice density
associated with making those sites vacant is $N/N^2 = 1/N$.
This cut divides an otherwise fully occupied lattice into
two separate regions (left and right of the firebreak). In
the limit $N \to \infty$, the cost in density of the cut is zero,
even though the division of the lattice into two separate
regions is preserved. Similarly, a collection of equally
spaced vertical and horizontal cuts on an otherwise
occupied lattice results in a gridded configuration dividing
the lattice into square regions of equal size, each
outlined by a firebreak one lattice spacing wide on each of
the four sides. For this configuration, all fires are of equal
size (the area of the contiguous square). For a finite
lattice such a solution could only be optimal for a spatially
uniform distribution of sparks. However, in the limit
$N \to \infty$ an infinite family of such solutions all achieve
the maximum yield of unity. All that is required is that
the cuts be positioned far enough apart that the grid
of firebreaks consumes zero density, yet close enough
together that the density cost associated with a fire in any
individual square of contiguous occupied sites is also zero.
This is achieved whenever the spacing between grid lines
scales like $N^{\gamma}$ with $0 < \gamma < 1$. This produces a yield
of unburnt trees that is asymptotically perfect (i.e.
approaches unity) for the entire forest for any distribution
of sparks, with infinitesimal fire sizes.
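
The scaling argument above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function name `grid_yield` and the tiling convention (each period of `s` sites holds `s - 1` trees followed by one vacant firebreak site, with `s` dividing `n` exactly) are assumptions made here, not definitions from the paper. Since a single spark can burn at most one block, the density minus one block's area is a lower bound on the yield for any spark distribution.

```python
def grid_yield(n: int, gamma: float) -> dict:
    """Yield bookkeeping for an n x n lattice cut into equal square blocks.

    Assumption (not from the paper): along each axis, every period of `s`
    sites holds s - 1 occupied sites followed by one vacant firebreak
    site, and `s` divides `n` exactly.
    """
    s = round(n ** gamma)                  # spacing between grid lines, ~ n**gamma
    if s < 2 or n % s != 0:
        raise ValueError("choose n and gamma so that s >= 2 and s divides n")
    density = ((s - 1) / s) ** 2           # fraction of sites holding trees
    firebreak_cost = 1.0 - density         # fraction of sites left vacant
    fire_loss = (s - 1) ** 2 / n ** 2      # at most one block burns per spark
    return {
        "spacing": s,
        "firebreak_cost": firebreak_cost,
        "fire_loss": fire_loss,
        "yield_lower_bound": density - fire_loss,
    }


for n in (100, 10_000, 1_000_000):
    print(n, grid_yield(n, gamma=0.5))
```

With `gamma = 0.5` both the firebreak cost (roughly `2 / s`) and the loss from a single fire (roughly `(s / n) ** 2`) shrink as `n` grows, so the lower bound on the yield approaches unity.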

It is straightforward to generalize this argument to
higher dimensions, because it relies only on the fact that
the barriers scale differently (like d - 1) compared to
the compact regions (like d). Unrealistically, a literal
interpretation of the lattice model suggests that with
proper management and minimal cost, essentially all fires
could be eliminated [?]. While this form of the HOT lattice
model is useful pedagogically as it exhibits such striking
differences from the SOC version, it has too many
flaws to be taken literally as a model of real forest fires
because the costs of resources for suppression are not
accounted for properly. While there is a natural duality
between vacant and occupied sites in the models of sta- 
tistical physics, in HOT models vacancies are resources 
which define boundaries that scale differently than the 
bulk substrate. For specific applications, resources are 
rarely (if ever) simply the absence of substrate. Even 
firebreaks constructed on forest land (e.g. roads) are not 
simply the absence of trees, but are cut and maintained 
at significant economic expense. 

VII. DISCUSSION 

The abstract HOT models studied in this paper cor- 
rect the pathology of the original HOT lattice model by 
explicitly accounting for resource use. The PLR and 
cuts models do this through an explicit cap on the to- 
tal resources available. The continuum model does this 
through inclusion of an explicit resource cost term in the 
yield function. Several preliminary calculations suggest 
that at least within a range of functional representations, 
the specific manner in which resources are accounted for 
is not a crucial factor in determining the exponent in the 
power law for these models. For example, a more general
cost-benefit term describing resource use can replace
the explicit cap on resources in the PLR model, at the 
expense of analytical tractability of the model, but with 
no significant change in the exponent. Analogously, the 
cuts model (in the limit of small event sizes) and the 
continuum model can lead to the same power law expo- 
nent, in spite of the fact that they account for the cost 
of resources in different ways. 

The key feature in determining the size distribution for 
a given model is that we optimize, while measuring the 
cost (or loss) in terms of the average event size.
Alternative formulations of the continuum model have
considered alternative cost or utility functions, which clearly
can lead to modifications in the event size distribution.
For example, if the cost function puts a large penalty on
events greater than a given size, then more resources will
be devoted to large events, at the expense of more small
events and a greater average size. Such considerations are
clearly relevant in cases such as finance and economics, 
where risk-seeking and risk-averse strategies come into 
play. 

Compared with models based on criticality, the power
laws predicted by all of the HOT models are much
steeper, and have the opposite trends with dimensionality.
In criticality the exponents become smaller for
lower dimensional problems. This is the opposite of
the trends observed in data [?], which typically exhibit
steeper power laws for lower dimensional problems, as in
HOT. It is worth noting (especially given our focus on
d = 1) that percolation in d = 1 has the (trivial)
critical density of unity, since the only way connectivity
can arise across a one dimensional lattice is for every site
to be occupied. Even in that case, the configurations and
size distribution (not a power law in d = 1) which arise in
random percolation in the neighborhood of the critical
density are completely unlike those that arise in the
corresponding one-dimensional HOT lattices. In criticality,
the placement of vacancies is random, whereas in
HOT the specific placement of vacancies is dictated by
optimization.

In models based on criticality, the self-similar, fractal
event shapes reflect a mechanism which is intrinsically
scale-free, producing a single exponent spanning all
scales. In contrast, in HOT models heavy-tailed
distributions arise from optimization on a macroscopic scale.
Compact regions predicted by HOT are not fractal or 
self-similar and there is no reason to expect that small 
scale events will a priori be described by the same power 
law as large scale events. 

The cuts model is a clear example in which we do
observe a heavy-tailed event size distribution, with
asymptotically different power law exponents as we vary the
scale. This model highlights the essential difference
between the dense and sparse resource regimes, which in
the original formulations of the continuum and PLR
models emerge from the distinction between inherently
continuum and discrete fields describing probabilities,
resources, and losses. In the continuum case, it is
simply not possible to capture features which could
arise as a consequence of discrete, sharp, well-separated
boundaries, which characterize the sparse resource regime.
Thus the continuum model agrees with the cuts model only in
the limit that the cuts (which are sharp and discrete) are
placed asymptotically close together, i.e. the dense
resource limit. On the other hand, the PLR model, which
assumes discrete event categories, can in principle
capture both the dense resource limit and the sparse resource
regime, though the latter will need additional treatment
because of the intrinsic role that mixtures play in real
data. In this paper we explored the PLR model in the
limit of dense resources, by taking the length scales of
the system $L$ and the event sizes $l_i/L$ simultaneously to
zero. In this limit, the PLR model can capture the
continuous, spatial spark distribution $p(x)$, though PLR
(and cuts) remain intrinsically discrete.

Based on this analysis, it may appear that the cuts 
model is the clear winner, simultaneously capturing the 
full range of behaviors seen in the other two, and this 
would be true if we only considered d = 1. However, in 
order to generalize the cuts model beyond d = 1 it is 
necessary to constrain the optimization procedure. For 
example, in [?] this was done by specifying a grid
design. In many cases, such a constrained design may not
be desirable, and the abstractions of the other models 
may be preferred. The continuum and PLR models are 
both easily formulated in arbitrary dimension d, but with 
different predictions for the exponents. As we've shown 
here, the PLR model can be extended to the dense re- 
source regime, where it agrees with the predictions of the 
continuum model. The reverse is not the case. In that 
sense, the continuum model is less flexible. Furthermore, 
the PLR model has been far more successful in capturing 
statistics of event size distributions, assuming data sets 
are dense enough to be described as continuous
distributions (e.g. assuming they are mixtures [?]). Examples
which have been studied include world wide web traffic,
forest fires, and power outages [?, ?, ?].

In comparison, we do not yet have any clear exam- 
ples where the predictions of the continuum model have 
been shown to apply. Perhaps the reason behind this lies 
in the fact that data is almost exclusively collected for 
large events in the sparse resource regime. In regimes 
where resources are abundant, one may simply choose 
not to optimize. Small file downloads, fires, and outages 
are rarely monitored, and small scale cutoffs, whether 
deliberately imposed for convenience or arising from an 
inherent physical mechanism, tend to prevent detailed 
statistical analysis of this regime. In any case, statistical 
distributions remain only a starting point for understand- 
ing mechanisms for complexity and modeling system fail- 
ure. Success arises from the study of simple models when 
their predictions capture aspects of the system which can
be described and quantified at a relatively low resolution.
From this initial success, they can inspire a sequence of 
higher resolution models and observations to understand 
and anticipate detailed mechanisms for cascading failure 
in natural and technological systems. 



APPENDIX A: ASYMPTOTIC LIMITS OF THE 
CUTS MODEL FOR A POWER LAW INITIAL 
PROBABILITY DENSITY 



In this appendix we derive the slope of $P(> l_i)$ on a
log-log plot for a cuts model where the initial probability
density is described by a power law, $p(x) = a x^{-(a+1)}$.
First we use the cuts model to find an analytic description
for the set of discrete probabilities $p_i$ and event sizes $l_i$:

$$
p_i = (c_{i-1})^{-a} - (c_i)^{-a}, \qquad
l_{i+1} = \frac{p_i}{p(c_i)}
        = \frac{(c_{i-1})^{-a} - (c_i)^{-a}}{a\,(c_i)^{-(a+1)}}.
\tag{30}
$$

We also recall the definitions for the cut positions $c_i$
and the cumulative probabilities $P_i = P(> l_i)$:

$$
c_i = c_{i-1} + l_i, \qquad
P_i = P(> l_i) = (c_{i-1})^{-a}.
\tag{31}
$$

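The slope of $P(> l_i)$ on a log-log plot can be calculated
in limiting cases by dividing $\Delta \log P_i$ by $\Delta \log l_i$, Taylor
expanding and dropping higher order terms:

$$
\frac{\Delta \log P_i}{\Delta \log l_i}
  = \frac{\log P_{i+1} - \log P_i}{\log l_{i+1} - \log l_i}.
\tag{32}
$$

Now we will assume that $l_i$ is small compared to $c_{i-1}$
and we will derive terms which can be Taylor expanded
to first order in $l_i/c_{i-1}$. We first evaluate the numerator of
Eq. (32):

$$
\begin{aligned}
\Delta \log P_i
  &= -a \log(c_{i-1} + l_i) + a \log(c_{i-1}) \\
  &= -a \log c_{i-1} - a \log\left(1 + \frac{l_i}{c_{i-1}}\right) + a \log(c_{i-1}) \\
  &= -a \log\left(1 + \frac{l_i}{c_{i-1}}\right).
\end{aligned}
\tag{33}
$$

Now we evaluate the denominator:

$$
\begin{aligned}
\Delta \log l_i
  &= \log l_{i+1} - \log l_i \\
  &= \log\left[\frac{(c_{i-1})^{-a} - (c_{i-1} + l_i)^{-a}}{a\,(c_{i-1} + l_i)^{-(a+1)}}\right] - \log l_i \\
  &= (a+1) \log\left(1 + \frac{l_i}{c_{i-1}}\right)
     - \log\left(\frac{a\, l_i}{c_{i-1}}\right)
     + \log\left[1 - \left(1 + \frac{l_i}{c_{i-1}}\right)^{-a}\right].
\end{aligned}
\tag{34}
$$

We assume $l_i/c_{i-1} \ll 1$ and use the binomial expansion
on the last term, keeping terms to second order,
$1 - (1 + l_i/c_{i-1})^{-a} \approx a\,(l_i/c_{i-1})\left[1 - \tfrac{a+1}{2}\,(l_i/c_{i-1})\right]$.
Then Eq. (34) becomes:

$$
\Delta \log l_i \approx (a+1) \log\left(1 + \frac{l_i}{c_{i-1}}\right)
  + \log\left(1 - \frac{a+1}{2}\,\frac{l_i}{c_{i-1}}\right).
\tag{35}
$$

Inserting the numerator and denominator back into
Eq. (32) we have:

$$
\frac{\Delta \log P_i}{\Delta \log l_i} \approx
\frac{-a \log\left(1 + \frac{l_i}{c_{i-1}}\right)}
     {(a+1) \log\left(1 + \frac{l_i}{c_{i-1}}\right)
      + \log\left(1 - \frac{a+1}{2}\,\frac{l_i}{c_{i-1}}\right)}.
\tag{36}
$$

Now we use the Taylor expansion $\log(1 + \epsilon) = \epsilon + O(\epsilon^2)$:

$$
\frac{\Delta \log P_i}{\Delta \log l_i} \approx
\frac{-a\,\frac{l_i}{c_{i-1}}}
     {(a+1)\,\frac{l_i}{c_{i-1}} - \frac{a+1}{2}\,\frac{l_i}{c_{i-1}}}
\tag{37}
$$

$$
\frac{\Delta \log P_i}{\Delta \log l_i} \approx -\frac{2a}{a+1}.
\tag{38}
$$

This is the slope of $P(> l_i)$ on a log-log plot in the limit
where $l$ becomes small.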

Now we will look in the opposite limit, where $l$ becomes
large. We first show that $l \to \infty$ implies $c_{i-1}/l_i \to 0$ if
$a > 1$. Using the definition for $l_{i+1}$ in Eq. (30) we derive
a recursion relation for $c_i/l_{i+1} = g_i$.
By definition:

$$
\begin{aligned}
g_i = \frac{c_i}{l_{i+1}}
  &= \frac{a\, c_i\, (c_i)^{-(a+1)}}{(c_{i-1})^{-a} - (c_i)^{-a}} \\
  &= \frac{a\,(c_{i-1} + l_i)^{-a}}{(c_{i-1})^{-a} - (c_{i-1} + l_i)^{-a}} \\
  &= \frac{a \left(\frac{c_{i-1}}{l_i} + 1\right)^{-a}}
          {\left(\frac{c_{i-1}}{l_i}\right)^{-a} - \left(\frac{c_{i-1}}{l_i} + 1\right)^{-a}} \\
  &= \frac{a\,(g_{i-1} + 1)^{-a}}{(g_{i-1})^{-a} - (g_{i-1} + 1)^{-a}}.
\end{aligned}
\tag{39}
$$

We note that $g_i$ will always be less than 1. Therefore we
can use the binomial expansion and write out the terms
to lowest order in $g_{i-1}$:

$$
g_i = \frac{a \left(1 - a\, g_{i-1} + O(g_{i-1}^2)\right)}
           {(g_{i-1})^{-a} - \left(1 - a\, g_{i-1} + O(g_{i-1}^2)\right)}.
\tag{40}
$$

Now we note that if we assume $g_{i-1} \ll 1$ for large $i$
we can drop all terms of order $g_{i-1}^2$. Also, for $a > 1$ the
first term in the denominator will be much larger than
the other terms, and we drop all the other terms in the
denominator. Then we have:

$$
g_i \approx \frac{a\,(1 - a\, g_{i-1})}{(g_{i-1})^{-a}}
    = a\,(g_{i-1})^{a} - a^2 (g_{i-1})^{a+1}.
\tag{41}
$$

We can then find the ratio of consecutive terms:

$$
g_i / g_{i-1} \approx a\,(g_{i-1})^{a-1} - a^2 (g_{i-1})^{a}.
\tag{42}
$$

Because $g_i < 1$ for all $i$, we see that our assumption
that $g_i \ll 1$ was indeed valid, and that the sequence
goes to zero as $i$ approaches infinity. We also note that if
$a < 1$, we can no longer assume that the first term in the
denominator in Eq. (40) is much larger than 1. In fact,
as $a \to 0$ the first term in the denominator approaches 1,
and it is not true that $c_{i-1}/l_i$ approaches zero for large $l_i$.

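We now solve Eq. (32) for terms which we can Taylor
expand to first order in $c_{i-1}/l_i$. First we simplify the
numerator:

$$
\begin{aligned}
\Delta \log P_i
  &= -a \log(c_{i-1} + l_i) + a \log(c_{i-1}) \\
  &= a \log\left(\frac{c_{i-1}}{l_i}\right) - a \log\left(1 + \frac{c_{i-1}}{l_i}\right) \\
  &= a \log\left(\frac{c_{i-1}}{l_i}\right)
     - a \left\{ \frac{c_{i-1}}{l_i} + O\left(\frac{c_{i-1}^2}{l_i^2}\right) \right\},
\end{aligned}
\tag{43}
$$

where in the last line we have used the Taylor expansion
$\log(1 + \epsilon) = \epsilon + O(\epsilon^2)$. As $c_{i-1}/l_i$ approaches 0,
$\log(c_{i-1}/l_i)$ becomes large and negative, and the terms inside
the braces in Eq. (43) become negligible. Therefore the numerator
in this limit is:

$$
\Delta \log P_i \approx a \log\left(\frac{c_{i-1}}{l_i}\right).
\tag{44}
$$

We simplify the denominator of Eq. (32):

$$
\begin{aligned}
\Delta \log l_i
  &= \log l_{i+1} - \log l_i \\
  &= (a+1) \log\left(1 + \frac{c_{i-1}}{l_i}\right) - \log a
     + \log\left[\left(\frac{c_{i-1}}{l_i}\right)^{-a}
                 - \left(1 + \frac{c_{i-1}}{l_i}\right)^{-a}\right].
\end{aligned}
\tag{45}
$$

Again we use the binomial expansion to approximate the
last term in the denominator:

$$
\begin{aligned}
\left(\frac{l_i}{c_{i-1}}\right)^{a} - \left(1 + \frac{c_{i-1}}{l_i}\right)^{-a}
  &\approx \left(\frac{l_i}{c_{i-1}}\right)^{a} - 1 + a\,\frac{c_{i-1}}{l_i} \\
  &\approx \left(\frac{l_i}{c_{i-1}}\right)^{a},
\end{aligned}
\tag{46}
$$

where in the last line we have used that
$(l_i/c_{i-1})^{a} \gg 1 > (c_{i-1}/l_i)$. Then the denominator
(Eq. (45)) becomes

$$
\begin{aligned}
\Delta \log l_i
  &\approx -a \log\left(\frac{c_{i-1}}{l_i}\right)
     + \left\{ (a+1) \log\left(1 + \frac{c_{i-1}}{l_i}\right) - \log a \right\} \\
  &= -a \log\left(\frac{c_{i-1}}{l_i}\right)
     + \left\{ (a+1)\,\frac{c_{i-1}}{l_i} - \log a
               + O\left(\frac{c_{i-1}^2}{l_i^2}\right) \right\},
\end{aligned}
\tag{47}
$$

where in the last line we used the Taylor expansion for
$\log(1 + \epsilon)$. Again we see that for any finite $a$ the terms
inside the braces in the last line of Eq. (47) become negligible
compared to $\log(c_{i-1}/l_i)$ in the limit that $c_{i-1}/l_i$ becomes
small. In this limit the denominator can be approximated
as:

$$
\Delta \log l_i \approx -a \log\left(\frac{c_{i-1}}{l_i}\right).
\tag{48}
$$

Dividing Eq. (44) by Eq. (48) we see that for small $c_{i-1}/l_i$,
the slope of $P(> l)$ on a log-log plot is:

$$
\frac{\Delta \log P_i}{\Delta \log l_i} \approx
\frac{a \log(c_{i-1}/l_i)}{-a \log(c_{i-1}/l_i)} = -1.
\tag{49}
$$

We have shown that if $a$, the exponent for the power law
spark distribution, is greater than 1, then $c_{i-1}/l_i$ becomes
small as $l_i$ becomes large. In this case we have shown $-1$
to be the asymptote of the exponent of $P(> l)$ for large
$l$.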
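
The two limiting slopes can also be checked numerically by iterating the recursion of Eqs. (30) and (31) directly. The sketch below is a minimal illustration, not the authors' procedure: the function name `cuts_slopes`, the starting values `l1` and `c0`, and the cutoff `c_max` are assumptions chosen here for convenience. The small-event reference value $-2a/(a+1)$ is what the expansion in $l_i/c_{i-1}$ gives when worked through from Eqs. (30) to (32); the large-event reference value is the asymptote $-1$ stated above.

```python
import math


def cuts_slopes(a: float, l1: float = 1e-3, c0: float = 1.0, c_max: float = 1e40):
    """Iterate the cuts recursion for p(x) = a * x**-(a + 1).

    Returns a list of (l_i, slope_i) pairs, where slope_i is the finite
    difference (log P_{i+1} - log P_i) / (log l_{i+1} - log l_i).
    l1, c0 and c_max are illustrative assumptions, not values from the paper.
    """
    out = []
    c_prev, l = c0, l1
    while c_prev < c_max:
        ratio = l / c_prev                     # l_i / c_{i-1}
        c = c_prev + l                         # Eq. (31): c_i = c_{i-1} + l_i
        # Eq. (30) rewritten as l_{i+1} = c_i * ((c_i / c_{i-1})**a - 1) / a,
        # evaluated with expm1/log1p to avoid cancellation at small ratios.
        l_next = c * math.expm1(a * math.log1p(ratio)) / a
        d_log_p = -a * math.log1p(ratio)       # log P_{i+1} - log P_i
        d_log_l = math.log(l_next / l)         # log l_{i+1} - log l_i
        out.append((l, d_log_p / d_log_l))
        c_prev, l = c, l_next
    return out


a = 2.0
pairs = cuts_slopes(a)
small_l_slope = pairs[0][1]    # first step: l_i much smaller than c_{i-1}
large_l_slope = pairs[-1][1]   # last step: l_i much larger than c_{i-1}
print("small-l slope:", small_l_slope, "expected about", -2 * a / (a + 1))
print("large-l slope:", large_l_slope, "expected about", -1.0)
```

Because $g_i$ shrinks very rapidly once $l_i$ exceeds $c_{i-1}$, the cutoff `c_max` is kept modest so that intermediate values stay within double precision for `a = 2`; larger exponents would need a smaller cutoff.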

APPENDIX B: OPTIMAL SOLUTIONS FOR 
TWO DIFFERENT COST FUNCTIONS 

We are interested in comparing the optimal solutions
for two different cost functions (equivalently, for two
different yield functions $Y$ in Eq. (??) and $Y^*$ in Eq. (??)).
The first cost function $J = \sum_i p_i l_i$ equates cost with
expected event size, and is used in PLR and continuum
models. The second cost function $J^* = \sum_i P_i l_i$ equates
cost with expected transferred event size and is used in
the cuts model. This situation arises when the frequency
with which an event is "transferred" is equal to the
cumulative probability of all larger events. One example is
sequentially linked web files. Though $J^*$ is less intuitive
than $J$, it has the very nice property that one can
analytically solve for the optimal event sizes $l_i$ given cost
function $J^*$. In most situations, however, we are really
interested in optimizing the original cost function $J$.
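
To make the two definitions concrete, the following Python sketch evaluates both cost functions for a given set of cut positions. It is only an illustration under stated assumptions: the helper name `costs_exponential` is hypothetical, the spark density is taken to be exponential, $p(x) = \lambda e^{-\lambda x}$ on $x \ge 0$, and the cumulative probability $P_i$ is taken to be the probability of a spark beyond $c_{i-1}$, by analogy with Eq. (31).

```python
import math


def costs_exponential(cuts, lam=1.0):
    """Return (J, J_star) for cut positions c_0 < c_1 < ... < c_n.

    Assumes p(x) = lam * exp(-lam * x) on x >= 0, so the probability of a
    spark landing beyond x is exp(-lam * x).
    """
    J = 0.0
    J_star = 0.0
    for c_prev, c in zip(cuts, cuts[1:]):
        l_i = c - c_prev                                     # event size
        p_i = math.exp(-lam * c_prev) - math.exp(-lam * c)   # probability of event i
        P_i = math.exp(-lam * c_prev)                        # cumulative: spark beyond c_{i-1}
        J += p_i * l_i
        J_star += P_i * l_i
    return J, J_star


J, J_star = costs_exponential([0.0, 1.0, 2.0])
print("J  =", J)
print("J* =", J_star)
```

Since $P_i \ge p_i$ for every event, $J^*$ is never smaller than $J$ for the same cuts; the question addressed below is whether the two costs are minimized by the same event sizes.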

In this section we will show that the optimal solutions
$\{l_i\}$ are the same for either definition of cost ($J^*$, $J$) in
the limits $l_i \to \infty$ and $l_i \to 0$. This allows us to directly
compare analytic results from the cuts model with results
from continuum and PLR models in limiting cases.

First, we recall that optimizing $J$ leads to a recursion
relation for optimal event sizes $l_i$,

$$
p_i + \left[\, l_i\, p(c_i) - p_{i+1} \right] = l_{i+1}\, p(c_i)
\tag{50}
$$

while optimizing $J^*$ leads to a different recursion
relation.

$$
p_i = l_{i+1}\, p(c_i)
\tag{51}
$$

Comparing Eq. (51) and (50), we see they give the
same result if the bracketed term in Eq. (50),
$\epsilon = l_i\, p(c_i) - p_{i+1}$, is much smaller than $p_i$. First we
will show this is the case in the limit $l_i \to 0$.

We want to show:

$$
\begin{aligned}
\epsilon = l_i\, p(c_i) - p_{i+1} &\ll p_i \\
l_i\, p(c_i) &\ll p_i + p_{i+1}
\end{aligned}
\tag{52}
$$

We can rewrite the right-hand side as

$$
p_i + p_{i+1} = \int_{c_{i-1}}^{c_{i+1}} p(x)\, dx = p_{avg}\,(l_{i+1} + l_i)
\tag{53}
$$

where $p_{avg}$ is the average value of $p(x)$ on the interval
$[c_{i-1}, c_{i+1}]$.

Then Eq. (52) can be rewritten as

$$
\left(p(c_i) - p_{avg}\right) l_i \ll p_{avg}\, l_{i+1}
\tag{54}
$$

We use the recursion relation for the event sizes given
by Eq. (??):

$$
l_{i+1} = \frac{e^{\lambda l_i} - 1}{\lambda}
\tag{55}
$$

In the limit $l_i \to 0$ this implies $l_{i+1} = l_i + O(l_i^2)$.
Neglecting terms of order $l_i^2$ we have

$$
\left(p(c_i) - p_{avg}\right) \ll p_{avg}
\tag{56}
$$

The position $c_i$ approaches the midpoint of the interval
$[c_{i-1}, c_{i+1}]$ because $l_{i+1} \to l_i$. As the length of the
interval goes to 0, the value of $p(x)$ at the midpoint,
$p(c_i)$, approaches the average value of $p(x)$ over the
interval. Therefore the left hand side of Eq. (56) is negligible
compared to the right-hand side. In the limit $l_i \to 0$, $\epsilon$ is
much smaller than $p_i$.

Now we will show that this is the case for the limit as
$l_i \to \infty$. Starting from Eq. (52) we again want to show:

$$
l_i\, p(c_i) \ll p_{i+1} + p_i = \int_{c_{i-1}}^{c_{i+1}} p(x)\, dx
\tag{57}
$$

Substituting $\lambda e^{-\lambda x}$ for $p(x)$ we have

$$
\begin{aligned}
l_i\, \lambda\, e^{-\lambda c_i} &\ll -e^{-\lambda c_{i+1}} + e^{-\lambda c_{i-1}} \\
  &\ll e^{-\lambda c_{i-1}} \left( -e^{-\lambda (l_i + l_{i+1})} + 1 \right) \\
  &\ll e^{-\lambda c_{i-1}}
\end{aligned}
\tag{58}
$$

where we have used $e^{-\lambda l_i} \ll 1$. Then we have

$$
\begin{aligned}
\log(\lambda l_i) - \lambda c_i &\ll -\lambda c_{i-1} \\
\log(\lambda l_i) &\ll \lambda l_i
\end{aligned}
\tag{59}
$$

For $l_i \to \infty$ and $\lambda$ fixed, $\log(\lambda l_i)$ is negligible compared
to $\lambda l_i$. Therefore in the limit $l_i \to \infty$, $\epsilon$ is much smaller
than $p_i$.

Therefore in these two limits, optimizing $J^*$ results in
the same optimal event sizes as optimizing $J$.
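
As a quick sanity check of the two limits, the ratio $\epsilon / p_i$ can be evaluated in closed form for the exponential spark density used in Eqs. (55) and (58). The sketch below is a minimal illustration; the function name `eps_over_p` and the sample values of $l_i$ are assumptions made here. The common factor $e^{-\lambda c_i}$ cancels in the ratio, so only $\lambda l_i$ enters.

```python
import math


def eps_over_p(l_i: float, lam: float = 1.0) -> float:
    """Ratio eps / p_i for p(x) = lam * exp(-lam * x) under the cuts recursion.

    Uses l_{i+1} = (exp(lam * l_i) - 1) / lam from Eq. (55). Valid for
    lam * l_i below about 700, where exp() still fits in a double.
    """
    x = lam * l_i
    y = math.expm1(x)                  # lam * l_{i+1}
    # eps / exp(-lam * c_i) = lam * l_i - (1 - exp(-lam * l_{i+1}))
    eps_scaled = x + math.expm1(-y)
    # p_i / exp(-lam * c_i) = exp(lam * l_i) - 1
    p_scaled = math.expm1(x)
    return eps_scaled / p_scaled


for l_i in (1e-3, 1.0, 50.0):
    print(l_i, eps_over_p(l_i))
```

The ratio is small at both ends of the range and largest for intermediate event sizes, where $\lambda l_i$ is of order one. This is consistent with the statement that the two cost functions share optimal event sizes only in the two limits.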